DNA SEQUENCING BY MASS SPECTROMETRY

Related Applications

This application is a continuation-in-part of U.S. Application Serial Number 
08/617,010, which is a continuation-in-part of U.S. Application Serial Number 
08/178,216, which issued as U.S. Patent No. 5,547,835, and which itself is a 
continuation-in-part of U.S. Application Serial Number 08/001,323 filed January 7, 1993, 
which is now abandoned. The contents of all related applications are incorporated herein 
by reference. 

Background of the Invention 

Since the genetic information is represented by the sequence of the four 
DNA building blocks deoxyadenosine- (dpA), deoxyguanosine- (dpG), deoxycytidine- 
(dpC) and deoxythymidine-5'-phosphate (dpT), DNA sequencing is one of the most
fundamental technologies in molecular biology and the life sciences in general. The ease 
and the rate by which DNA sequences can be obtained greatly affect related
technologies such as development and production of new therapeutic agents and new and 
useful varieties of plants and microorganisms via recombinant DNA technology. In 
particular, unraveling the DNA sequence helps in understanding human pathological 
conditions including genetic disorders, cancer and AIDS. In some cases, very subtle
differences such as a one nucleotide deletion, addition or substitution can create serious, 
in some cases even fatal, consequences. Recently, DNA sequencing has become the core 
technology of the Human Genome Sequencing Project (e.g., J.E. Bishop and M.
Waldholz, 1991, Genome: The Story of the Most Astonishing Scientific Adventure of
Our Time - The Attempt to Map All the Genes in the Human Body, Simon & Schuster,
New York). Knowledge of the complete human genome DNA sequence will certainly 
help to understand, to diagnose, to prevent and to treat human diseases. To be able to 
tackle successfully the determination of the approximately 3 billion base pairs of the 
human genome in a reasonable time frame and in an economical way, rapid, reliable,
sensitive and inexpensive methods need to be developed, which also offer the possibility
of automation. The present invention provides such a technology. 

Recent reviews of today's methods together with future directions and 
trends are given by Barrell (The FASEB Journal 5, 40-45 (1991)), and Trainor (Anal.
Chem. 62, 418-26 (1990)).

Currently, DNA sequencing is performed by either the chemical degradation
method of Maxam and Gilbert (Methods in Enzymology 65, 499-560 (1980)) or the
enzymatic dideoxynucleotide termination method of Sanger et al. (Proc. Natl. Acad. Sci.
USA 74, 5463-67 (1977)). In the chemical method, base specific modifications result in
a base specific cleavage of the radioactive or fluorescently labeled DNA fragment. With 
the four separate base specific cleavage reactions, four sets of nested fragments are 
produced which are separated according to length by polyacrylamide gel electrophoresis 
(PAGE). After autoradiography, the sequence can be read directly since each band 
(fragment) in the gel originates from a base specific cleavage event. Thus, the fragment 
lengths in the four "ladders" directly translate into a specific position in the DNA 
sequence. 

In the enzymatic chain termination method, the four base specific sets of 
DNA fragments are formed by starting with a primer/template system elongating the 
primer into the unknown DNA sequence area and thereby copying the template and 
synthesizing a complementary strand by DNA polymerases, such as Klenow fragment of
E. coli DNA polymerase I, a DNA polymerase from Thermus aquaticus, Taq DNA
polymerase, or a modified T7 DNA polymerase, Sequenase (Tabor et al., Proc. Natl.
Acad. Sci. USA 84, 4767-4771 (1987)), in the presence of chain-terminating reagents.
Here, the chain-terminating event is achieved by incorporating into the four separate 
reaction mixtures in addition to the four normal deoxynucleoside triphosphates, dATP, 
dGTP, dTTP and dCTP, only one of the chain-terminating dideoxynucleoside 
triphosphates, ddATP, ddGTP, ddTTP or ddCTP, respectively, in a limiting small 
concentration. The four sets of resulting fragments produce, after electrophoresis, four 
base specific ladders from which the DNA sequence can be determined. 
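
The ladder logic common to both methods can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `nested_fragments` and `read_ladders` and the example strand are hypothetical, the primer is ignored, and no reaction chemistry is modeled. Each base-specific reaction is represented simply by the lengths of the fragments that end in that base.

```python
def nested_fragments(strand: str) -> dict[str, list[int]]:
    """Return, for each base, the lengths of the fragments terminated at that base."""
    ladders: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(strand, start=1):
        # A fragment of this length ends with `base`, so it belongs to that ladder.
        ladders[base].append(length)
    return ladders


def read_ladders(ladders: dict[str, list[int]]) -> str:
    """Read the sequence by ordering all fragments from shortest to longest."""
    by_length = {
        length: base for base, lengths in ladders.items() for length in lengths
    }
    return "".join(by_length[length] for length in sorted(by_length))


# Hypothetical newly synthesized strand (illustration only).
example_strand = "GATTACA"
example_ladders = nested_fragments(example_strand)
```

Reading the four ladders in order of fragment length returns the original strand, which is the same operation a person performs when reading a sequencing gel from bottom to top.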

A recent modification of the Sanger sequencing strategy involves the 
degradation of phosphorothioate-containing DNA fragments obtained by using alpha-thio 
dNTP instead of the normally used ddNTPs during the primer extension reaction
mediated by DNA polymerase (Labeit et al., DNA 5, 173-177 (1986); Amersham, PCT
Application GB86/00349; Eckstein et al., Nucleic Acids Res. 16, 9947 (1988)). Here,
the four sets of base-specific sequencing ladders are obtained by limited digestion with 
exonuclease III or snake venom phosphodiesterase, subsequent separation on PAGE and 
visualization by radioisotopic labeling of either the primer or one of the dNTPs. In a 
further modification, the base-specific cleavage is achieved by alkylating the sulphur atom 
in the modified phosphodiester bond followed by a heat treatment (Max-Planck-Gesellschaft,
DE 3930312 A1). Both methods can be combined with the amplification of
the DNA via the Polymerase Chain Reaction (PCR). 

On the upfront end, the DNA to be sequenced has to be fragmented into 
sequenceable pieces of currently not more than 500 to 1000 nucleotides. Starting from a
genome, this is a multi-step process involving cloning and subcloning steps using different
and appropriate cloning vectors such as YAC, cosmids, plasmids and M13 vectors
(Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, 1989). Finally, for Sanger sequencing, the fragments of about 500 to
1000 base pairs are integrated into a specific restriction site of the replicative form I (RF
I) of a derivative of the M13 bacteriophage (Vieira and Messing, Gene 19, 259 (1982))
and then the double-stranded form is transformed to the single-stranded circular form to
serve as a template for the Sanger sequencing process having a binding site for a
universal primer obtained by chemical DNA synthesis (Sinha, Biernat, McManus and
Köster, Nucleic Acids Res. 12, 4539-57 (1984); U.S. Patent No. 4,725,677) upstream of
the restriction site into which the unknown DNA fragment has been inserted. Under
specific conditions, unknown DNA sequences integrated into supercoiled double-
stranded plasmid DNA can be sequenced directly by the Sanger method (Chen and
Seeburg, DNA 4, 165-170 (1985) and Lim et al., Gene Anal. Techn. 5, 32-39 (1988)),
and, with the Polymerase Chain Reaction (PCR) (PCR Protocols: A Guide to Methods
and Applications, Innis et al., editors, Academic Press, San Diego (1990)), cloning or
subcloning steps could be omitted by directly sequencing off chromosomal DNA by first
amplifying the DNA segment by PCR and then applying the Sanger sequencing method
(Innis et al., Proc. Natl. Acad. Sci. USA 85, 9436-9440 (1988)). In this case, however,
the DNA sequence in the region of interest must be known at least to the extent to bind a
sequencing primer.

In order to be able to read the sequence from PAGE, detectable labels have
to be used in either the primer (very often at the 5'-end) or in one of the deoxynucleoside
triphosphates, dNTP. Using radioisotopes such as 32P, 33P, or 35S is still the most
frequently used technique. After PAGE, the gels are exposed to X-ray films and silver
grain exposure is analyzed. The use of radioisotopic labeling creates several problems.
Most labels useful for autoradiographic detection of sequencing fragments have
relatively short half-lives which can limit the useful time of the labels. The emission of high
energy beta radiation, particularly from 32P, can lead to breakdown of the products via
radiolysis so that the sample should be used very quickly after labeling. In addition, high
energy radiation can also cause a deterioration of band sharpness by scattering. Some of
these problems can be reduced by using the less energetic isotopes such as 33P or 35S
(see, e.g., Ornstein et al., Biotechniques 3, 476 (1985)). Here, however, longer exposure
times have to be tolerated. Above all, the use of radioisotopes poses significant health
risks to the experimentalist and, in heavy sequencing projects, decontamination and
handling the radioactive waste are other severe problems and burdens.

In response to the above mentioned problems related to the use of 
radioactive labels, non-radioactive labeling techniques have been explored and, in recent 
years, integrated into partly automated DNA sequencing procedures. All these 
improvements utilize the Sanger sequencing strategy. The fluorescent label can be tagged
to the primer (Smith et al., Nature 321, 674-679 (1986) and EPO Patent No.
87300998.9; Du Pont De Nemours EPO Application No. 0359225; Ansorge et al., J.
Biochem. Biophys. Methods 13, 325-32 (1986)) or to the chain-terminating
dideoxynucleoside triphosphates (Prober et al., Science 238, 336-41 (1987); Applied
Biosystems, PCT Application WO 91/05060). Based on either labeling the primer or the
ddNTP, systems have been developed by Applied Biosystems (Smith et al., Science 235,
G89 (1987); U.S. Patent Nos. 570973 and 689013), Du Pont De Nemours (Prober et al.,
Science 238, 336-341 (1987); U.S. Patent Nos. 881372 and 57566), Pharmacia-LKB
(Ansorge et al., Nucleic Acids Res. 15, 4593-4602 (1987) and EMBL Patent Application
DE P3724442 and P3805808.1) and Hitachi (JP 1-90844 and DE 4011991 A1). A
somewhat similar approach was developed by Brumbaugh et al. (Proc. Natl. Acad. Sci. USA
85, 5610-14 (1988) and U.S. Patent No. 4,729,947). An improved method for the Du
Pont system using two electrophoretic lanes with two different specific labels per lane is
described (PCT Application WO92/02635). A different approach uses fluorescently
labeled avidin and biotin labeled primers. Here, the sequencing ladders ending with biotin
are reacted during electrophoresis with the labeled avidin which results in the detection of
the individual sequencing bands (Brumbaugh et al., U.S. Patent No. 594676).

More recently even more sensitive non-radioactive labeling techniques for
DNA using chemiluminescence triggerable and amplifiable by enzymes have been
developed (Beck, O'Keefe, Coull and Köster, Nucleic Acids Res. 17, 5115-5123 (1989)
and Beck and Köster, Anal. Chem. 62, 2258-2270 (1990)). These labeling methods were
combined with multiplex DNA sequencing (Church et al., Science 240, 185-188 (1988)) to
provide for a strategy aimed at high throughput DNA sequencing (Köster et al.,
Nucleic Acids Res. Symposium Ser. No. 24, 318-321 (1991); University of Utah, PCT
Application No. WO 90/15883); this strategy still suffers from the disadvantage of being
very laborious and difficult to automate.

In an attempt to simplify DNA sequencing, solid supports have been 
introduced. In most cases published so far, the template strand for sequencing (with or
without PCR amplification) is immobilized on a solid support most frequently utilizing
the strong biotin-avidin/streptavidin interaction (Orion-Yhtyma Oy, U.S. Patent No.
277643; M. Uhlen et al., Nucleic Acids Res. 16, 3025-38 (1988); Cemu Bioteknik, PCT
Application No. WO 89/09282 and Medical Research Council, GB, PCT Application No.
WO 92/03575). The primer extension products synthesized on the immobilized template
strand are purified of enzymes, other sequencing reagents and by-products by a washing
step and then released under denaturing conditions by disrupting the hydrogen bonds
between the Watson-Crick base pairs and subjected to PAGE separation. In a different
approach, the primer extension products (not the template) from a DNA sequencing
reaction are bound to a solid support via biotin/avidin (Du Pont De Nemours, PCT
Application WO 91/11533). In contrast to the above mentioned methods, here, the
interaction between biotin and avidin is overcome by employing denaturing conditions 
(formamide/EDTA) to release the primer extension products of the sequencing reaction 
from the solid support for PAGE separation. As solid supports, beads (e.g., magnetic
beads (Dynabeads) and Sepharose beads), filters, capillaries, plastic dipsticks (e.g.,
polystyrene strips) and microtiter wells are being proposed.

All methods discussed so far have one central step in common:
polyacrylamide gel electrophoresis (PAGE). In many instances, this represents a major
drawback and limitation for each of these methods. Preparing a homogeneous gel by
polymerization, loading of the samples, the electrophoresis itself, detection of the
sequence pattern (e.g., by autoradiography), removing the gel and cleaning the glass
plates to prepare another gel are very laborious and time-consuming procedures.
Moreover, the whole process is error-prone, difficult to automate, and, in order to
improve reproducibility and reliability, highly trained and skilled personnel are required.
In the case of radioactive labeling, autoradiography itself can consume from hours to
days. In the case of fluorescent labeling, at least the detection of the sequencing bands is
being performed automatically when using the laser-scanning devices integrated into
commercially available DNA sequencers. One problem related to the fluorescent labeling is
the influence of the four different base-specific fluorescent tags on the mobility of the
fragments during electrophoresis and a possible overlap in the spectral bandwidth of the
four specific dyes reducing the discriminating power between neighboring bands, hence,
increasing the probability of sequence ambiguities. Artifacts are also produced by base-
specific interactions with the polyacrylamide gel matrix (Frank and Köster, Nucleic
Acids Res. 6, 2069 (1979)) and by the formation of secondary structures which result in
"band compressions" and hence do not allow one to read the sequence. This problem
has, in part, been overcome by using 7-deazadeoxyguanosine triphosphates (Barr et al.,
Biotechniques 4, 428 (1986)). However, the reasons for some artifacts and conspicuous
bands are still under investigation and need further improvement of the gel
electrophoretic procedure.

A recent innovation in electrophoresis is capillary zone electrophoresis 

(CZE) (Jorgenson et al., J. Chromatography 352, 337 (1986); Gesteland et al.,
Nucleic Acids Res. 18, 1415-1419 (1990)) which, compared to slab gel electrophoresis
(PAGE), significantly increases the resolution of the separation, reduces the time for an
electrophoretic run and allows the analysis of very small samples. Here, however, other
problems arise due to the miniaturization of the whole system such as wall effects and the
necessity of highly sensitive on-line detection methods. Compared to PAGE, another
drawback is created by the fact that CZE is only a "one-lane" process, whereas in PAGE
samples in multiple lanes can be electrophoresed simultaneously.

Due to the severe limitations and problems related to having PAGE as an
integral and central part in the standard DNA sequencing protocol, several methods have 
been proposed to do DNA sequencing without an electrophoretic step. One approach 
calls for hybridization or fragmentation sequencing (Bains, Biotechnology 10, 757-58 
(1992) and Mirzabekov et al., FEBS Letters 256, 118-122 (1989)) utilizing the specific
hybridization of known short oligonucleotides (e.g., octadeoxynucleotides, which give
65,536 different sequences) to a complementary DNA sequence. Positive hybridization
reveals a short stretch of the unknown sequence. Repeating this process by performing
hybridizations with all possible octadeoxynucleotides should theoretically determine the
sequence. In a completely different approach, rapid sequencing of DNA is done by
unilaterally degrading one single, immobilized DNA fragment by an exonuclease in a
moving flow stream and detecting the cleaved nucleotides by their specific fluorescent tag
via laser excitation (Jett et al., J. Biomolecular Structure & Dynamics 7, 301-309
(1989); United States Department of Energy, PCT Application No. WO 89/03432). In
another system proposed by Hyman (Anal. Biochem. 174, 423-436 (1988)), the
pyrophosphate generated when the correct nucleotide is attached to the growing chain on 
a primer-template system is used to determine the DNA sequence. The enzymes used 
and the DNA are held in place by solid phases (DEAE-Sepharose and Sepharose) either 
by ionic interactions or by covalent attachment. In a continuous flow-through system, the 
amount of pyrophosphate is determined via bioluminescence (luciferase). A synthesis 
approach to DNA sequencing is also used by Tsien et al. (PCT Application No. WO
91/06678). Here, the incoming dNTPs are protected at the 3'-end by various blocking
groups such as acetyl or phosphate groups, which are removed before the next elongation
step, which makes this process very slow compared to standard sequencing methods.
The template DNA is immobilized on a polymer support. To detect incorporation, a
fluorescent or radioactive label is additionally incorporated into the modified dNTPs.
The same patent application also describes an apparatus designed to automate the 
process. 
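
The figure of 65,536 octadeoxynucleotides quoted for the hybridization approach above is simply 4 to the power 8. The short Python sketch below enumerates the probe set and lists which octamers a target would reveal. It is a simplified illustration, not the cited authors' procedure: the names `all_oligonucleotides` and `positive_probes` are hypothetical, and each probe is represented by the stretch of target sequence it reveals rather than by its complementary strand.

```python
from itertools import product

BASES = "ACGT"


def all_oligonucleotides(k: int) -> list[str]:
    """Enumerate every possible oligonucleotide of length k (4**k sequences)."""
    return ["".join(letters) for letters in product(BASES, repeat=k)]


def positive_probes(target: str, k: int = 8) -> set[str]:
    """Return the k-mers present in the target, i.e. the probes that would hybridize.

    Simplification: a probe is identified with the target stretch it reveals.
    """
    return {target[i:i + k] for i in range(len(target) - k + 1)}


octamers = all_oligonucleotides(8)
# Target used for illustration: the 14-mer cited later in the text as SEQ ID NO:1.
hits = positive_probes("CATGCCATGGCATG")
```

Only the overlapping octamers contained in the target give a positive signal, and the sequence is then assembled from their overlaps.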

Mass spectrometry, in general, provides a means of "weighing" individual
molecules by ionizing the molecules in vacuo and making them "fly" by volatilization.
Under the influence of combinations of electric and magnetic fields, the ions follow
trajectories depending on their individual mass (m) and charge (z). In the range of
molecules with low molecular weight, mass spectrometry has long been part of the
routine physical-organic repertoire for analysis and characterization of organic molecules
by the determination of the mass of the parent molecular ion. In addition, by arranging
collisions of this parent molecular ion with other particles (e.g., argon atoms), the
molecular ion is fragmented forming secondary ions by the so-called collision induced
dissociation (CID). The fragmentation pattern/pathway very often allows the derivation
of detailed structural information. Many applications of mass spectrometric methods are
known in the art, particularly in biosciences, and can be found summarized in
Methods in Enzymology, Vol. 193: "Mass Spectrometry" (J.A. McCloskey, editor),
1990, Academic Press, New York.

Due to the apparent analytical advantages of mass spectrometry in 
providing high detection sensitivity, accuracy of mass measurements, detailed structural 
information by CID in conjunction with an MS/MS configuration and speed, as well as
on-line data transfer to a computer, there has been considerable interest in the use of
mass spectrometry for the structural analysis of nucleic acids. Recent reviews
summarizing this field include K. H. Schram, "Mass Spectrometry of Nucleic Acid
Components," Biomedical Applications of Mass Spectrometry 24, 203-287 (1990); and
P.F. Crain, "Mass Spectrometric Techniques in Nucleic Acid Research," Mass
Spectrometry Reviews 9, 505-554 (1990). The biggest hurdle to applying mass
spectrometry to nucleic acids is the difficulty of volatilizing these very polar biopolymers.
Therefore, "sequencing" has been limited to low molecular weight synthetic
oligonucleotides by determining the mass of the parent molecular ion and through this,
confirming the already known sequence, or alternatively, confirming the known sequence
through the generation of secondary ions (fragment ions) via CID in an MS/MS
configuration utilizing, in particular, for the ionization and volatilization, the method of
fast atomic bombardment (FAB mass spectrometry) or plasma desorption (PD mass
spectrometry). As an example, the application of FAB to the analysis of protected
dimeric blocks for chemical synthesis of oligodeoxynucleotides has been described
(Köster et al., Biomedical Environmental Mass Spectrometry 14, 111-116 (1987)).

Two more recent ionization/desorption techniques are electrospray/ionspray
(ES) and matrix-assisted laser desorption/ionization (MALDI). ES mass spectrometry 
has been introduced by Fenn et al. (J. Phys. Chem. 88, 4451-59 (1984); PCT Application
No. WO 90/14148) and current applications are summarized in recent review articles
(R.D. Smith et al., Anal. Chem. 62, 882-89 (1990) and B. Ardrey, Electrospray Mass
Spectrometry, Spectroscopy Europe 4, 10-18 (1992)). The molecular weights of the
tetradecanucleotide d(CATGCCATGGCATG) (SEQ ID NO:1) (Covey et al., "The
Determination of Protein, Oligonucleotide and Peptide Molecular Weights by Ionspray
Mass Spectrometry," Rapid Communications in Mass Spectrometry 2, 249-256 (1988)),
of the 21-mer d(AAATTGTGCACATCCTGCAGC) (SEQ ID NO:2) and, without giving
details, of a tRNA with 76 nucleotides (Methods in Enzymology 193, "Mass
Spectrometry" (McCloskey, editor), p. 425, 1990, Academic Press, New York) have
been published. As a mass analyzer, a quadrupole is most frequently used. The
determination of molecular weights in femtomole amounts of sample is very accurate due
to the presence of multiple ion peaks which all could be used for the mass calculation.
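
How several multiply charged peaks can each contribute to one mass determination may be illustrated with a short Python sketch. It rests on assumptions that are not taken from the text: ions are assumed to form in negative-ion mode by loss of protons, the proton mass is taken as approximately 1.00728 Da, the neutral mass of 6000 Da is purely hypothetical, and all function names are invented for the illustration.

```python
PROTON_MASS = 1.00728  # Da, approximate value (assumption for this sketch)


def mz_negative(mass: float, charge: int) -> float:
    """m/z of an ion formed by removing `charge` protons from a neutral molecule."""
    return (mass - charge * PROTON_MASS) / charge


def mass_from_adjacent_peaks(mz_low: float, mz_high: float) -> float:
    """Estimate the neutral mass from two neighboring peaks (charges n+1 and n)."""
    # For adjacent charge states, (mz_low + H) / (mz_high - mz_low) equals n.
    n = round((mz_low + PROTON_MASS) / (mz_high - mz_low))
    return n * (mz_high + PROTON_MASS)


def average_mass(peaks: list[float]) -> float:
    """Average the estimates obtained from every pair of neighboring peaks."""
    ordered = sorted(peaks)
    estimates = [
        mass_from_adjacent_peaks(low, high)
        for low, high in zip(ordered, ordered[1:])
    ]
    return sum(estimates) / len(estimates)


hypothetical_mass = 6000.0  # Da, illustration only
simulated_peaks = [mz_negative(hypothetical_mass, z) for z in range(3, 8)]
estimated = average_mass(simulated_peaks)
```

Because every neighboring pair of peaks yields an independent estimate of the same neutral mass, averaging them reduces the influence of error in any single peak position.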

MALDI mass spectrometry, in contrast, can be particularly attractive when 
a time-of-flight (TOF) configuration is used as a mass analyzer. The MALDI-TOF mass 
spectrometry has been introduced by Hillenkamp et al. ("Matrix Assisted UV-Laser
Desorption/Ionization: A New Approach to Mass Spectrometry of Large Biomolecules,"
Biological Mass Spectrometry (Burlingame and McCloskey, editors), Elsevier Science
Publishers, Amsterdam, pp. 49-60, 1990). Since, in most cases, no multiple molecular
ion peaks are produced with this technique, the mass spectra, in principle, look simpler
compared to ES mass spectrometry. Although DNA molecules up to a molecular weight
of 410,000 daltons could be desorbed and volatilized (Williams et al., "Volatilization of
High Molecular Weight DNA by Pulsed Laser Ablation of Frozen Aqueous Solutions,"
Science 246, 1585-87 (1989)), this technique has so far only been used to determine the
molecular weights of relatively small oligonucleotides of known sequence, e.g.,
oligothymidylic acids up to 18 nucleotides (Huth-Fehre et al., "Matrix-Assisted Laser
Desorption Mass Spectrometry of Oligodeoxythymidylic Acids,"
Rapid Communications in Mass Spectrometry 6, 209-13 (1992)) and a double-stranded
DNA of 28 base pairs (Williams et al., "Time-of-Flight Mass Spectrometry of Nucleic
Acids by Laser Ablation and Ionization from a Frozen Aqueous Matrix," Rapid
Communications in Mass Spectrometry 4, 348-351 (1990)). In one publication (Huth-
Fehre et al., 1992, supra), it was shown that a mixture of all the oligothymidylic acids
from n=12 to n=18 nucleotides could be resolved.

In U.S. Patent No. 5,064,754, RNA transcripts extended by DNA, both of
which are complementary to the DNA to be sequenced, are prepared by incorporating
NTPs, dNTPs and, as terminating nucleotides, ddNTPs which are substituted at the 5'-
position of the sugar moiety with one or a combination of the isotopes 12C, 13C, 14C,
1H, 2H, 3H, 16O, 17O and 18O. The polynucleotides obtained are degraded to 3'-
nucleotides, cleaved at the N-glycosidic linkage and the isotopically labeled 5'-
functionality removed by periodate oxidation and the resulting formaldehyde species
determined by mass spectrometry. A specific combination of isotopes serves to
discriminate base-specifically between internal nucleotides originating from the
incorporation of NTPs and dNTPs and terminal nucleotides caused by linking ddNTPs
to the end of the polynucleotide chain. A series of RNA/DNA fragments is produced,
and in one embodiment, separated by electrophoresis, and, with the aid of the so-called
matrix method of analysis, the sequence is deduced.

In Japanese Patent No. 59-131909, an instrument is described which detects 
nucleic acid fragments separated either by electrophoresis, liquid chromatography or high
speed gel filtration. Mass spectrometric detection is achieved by incorporating into the
nucleic acids atoms which normally do not occur in DNA such as S, Br, I or Ag, Au, Pt,
Os, Hg. The method, however, is not applied to sequencing of DNA using the Sanger
method. In particular, it does not propose a base-specific correlation of such elements to 
an individual ddNTP. 

PCT Application No. WO 89/12694 (Brennan et al., Proc. SPIE-Int. Soc.
Opt. Eng. 1206 (New Technol. Cytom. Mol. Biol.), pp. 60-77 (1990); and Brennan,
U.S. Patent No. 5,003,059) employs the Sanger methodology for DNA sequencing by
using a combination of either the four stable isotopes 32S, 33S, 34S, 36S or 35Cl, 37Cl,
79Br, 81Br to specifically label the chain-terminating ddNTPs. The sulfur isotopes can
be located either in the base or at the alpha-position of the triphosphate moiety whereas
the halogen isotopes are located either at the base or at the 3'-position of the sugar ring.
The sequencing reaction mixtures are separated by an electrophoretic technique such as
CZE, transferred to a combustion unit in which the sulfur isotopes of the incorporated
ddNTPs are transformed at about 900°C in an oxygen atmosphere. The SO2 generated
with masses of 64, 65, 66 or 68 is determined on-line by mass spectrometry using, e.g., as
mass analyzer, a quadrupole with a single ion-multiplier to detect the ion current.
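
The four SO2 masses quoted above follow directly from the mass numbers of the four stable sulfur isotopes. The brief Python check below shows the arithmetic; it assumes, for illustration only, that both oxygen atoms are 16O and works with nominal (integer) masses. The names used are hypothetical.

```python
SULFUR_ISOTOPES = (32, 33, 34, 36)  # mass numbers of the stable sulfur isotopes
OXYGEN_16 = 16                      # assumption: both oxygen atoms are 16O


def so2_nominal_mass(sulfur_mass_number: int) -> int:
    """Nominal mass of SO2 for a given sulfur isotope: S + 2 x O."""
    return sulfur_mass_number + 2 * OXYGEN_16


# Map each detected SO2 mass back to the sulfur isotope that produced it.
isotope_by_so2_mass = {so2_nominal_mass(s): s for s in SULFUR_ISOTOPES}
so2_masses = sorted(isotope_by_so2_mass)
```

Since each isotope labels one chain terminator, the detected SO2 mass identifies which ddNTP ended the fragment passing through the combustion unit.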

A similar approach is proposed in U.S. Patent No. 5,002,868 (Jacobson et
al., Proc. SPIE-Int. Soc. Opt. Eng. 1435 (Opt. Methods Ultrasensitive Detect. Anal.
Tech. Appl.), 26-35 (1991)) using Sanger sequencing with four ddNTPs specifically
substituted at the alpha-position of the triphosphate moiety with one of the four stable
sulfur isotopes as described above and subsequent separation of the four sets of nested
sequences by tube gel electrophoresis. The only difference is the use of resonance
ionization spectroscopy (RIS) in conjunction with a magnetic sector mass analyzer as
disclosed in U.S. Patent No. 4,442,354 to detect the sulfur isotopes corresponding to the
specific nucleotide terminators, and by this, allowing the assignment of the DNA
sequence.

EPO Patent Applications No. 0360676 A1 and 0360677 A1 also describe
Sanger sequencing using stable isotope substitutions in the ddNTPs such as D,
15N, 17O, 18O, 32S, 33S, 34S, 36S, 19F, 35Cl, 37Cl, 79Br, 81Br and 127I or functional
groups such as CF3 or Si(CH3)3 at the base, the sugar or the alpha position of the
triphosphate moiety according to chemical functionality. The Sanger sequencing reaction
mixtures are separated by tube gel electrophoresis. The effluent is converted into an
aerosol by the electrospray/thermospray nebulizer method and then atomized and ionized
by a hot plasma (7000 to 8000 K) and analyzed by a simple mass analyzer. An
instrument is proposed which enables one to automate the analysis of the Sanger
sequencing reaction mixture consisting of tube electrophoresis, a nebulizer and a mass
analyzer.

The application of mass spectrometry to perform DNA sequencing by the 
hybridization/fragment method (see above) has been recently suggested (Bains, "DNA 
Sequencing by Mass Spectrometry: Outline of a Potential Future Application,"
Chimicaoggi 9, 13-16 (1991)).

Summary of the Invention

The invention describes a new method to sequence DNA. The 
improvements over the existing DNA sequencing technologies include high speed, high 
throughput, no required electrophoresis (and, thus, no gel reading artifacts due to the 
complete absence of an electrophoretic step), and no costly reagents involving various
substitutions with stable isotopes. The invention utilizes the Sanger sequencing strategy 
and assembles the sequence information by analysis of the nested fragments obtained by 
base-specific chain termination via their different molecular masses using mass 
spectrometry, for example, MALDI or ES mass spectrometry. A further increase in 
throughput can be obtained by introducing mass modifications in the oligonucleotide
primer, the chain-terminating nucleoside triphosphates and/or the chain-elongating 
nucleoside triphosphates, as well as using integrated tag sequences which allow 
multiplexing by hybridization of tag specific probes with mass differentiated molecular 
weights. 
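
The idea of assembling sequence information from the molecular masses of nested, base-specifically terminated fragments can be sketched as follows. This Python example is a simplified illustration and not the procedure of the invention: the residue masses are approximate average values supplied as assumptions, the primer, end groups and the mass difference of the dideoxy terminator are ignored, and the names `fragment_families` and `sequence_from_masses` are hypothetical.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA chain.
# These numbers are assumptions for this sketch, not values from the text.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def fragment_families(strand: str) -> dict[str, list[float]]:
    """Return, per terminating base, the relative masses of the nested fragments."""
    families: dict[str, list[float]] = {base: [] for base in "ACGT"}
    running_mass = 0.0
    for base in strand:
        running_mass += RESIDUE_MASS[base]
        # The fragment ending here appears in the reaction terminated at `base`.
        families[base].append(round(running_mass, 2))
    return families


def sequence_from_masses(families: dict[str, list[float]]) -> str:
    """Order all peaks by mass; the family of each peak gives the base."""
    peaks = sorted(
        (mass, base) for base, masses in families.items() for mass in masses
    )
    return "".join(base for _, base in peaks)


# First 15 nucleotides of the model 50-mer listed with FIGURE 15, for illustration.
model_strand = "TAACGGTCATTACGG"
families = fragment_families(model_strand)
recovered = sequence_from_masses(families)
```

Each longer fragment is heavier than the one before it, so sorting the peaks of the four families by mass plays the role that sorting bands by length plays on a gel.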

Brief Description of the FIGURES 

FIGURE 1 is a representation of a process to generate the samples to be 
analyzed by mass spectrometry. This process entails insertion of a DNA fragment of 
unknown sequence into a cloning vector such as derivatives of M13, pUC or phagemids;
transforming the double-stranded form into the single-stranded form; performing the four
Sanger sequencing reactions; linking the base-specifically terminated nested fragment
family temporarily to a solid support; removing by a washing step all by-products;
conditioning the nested DNA or RNA fragments by, for example, cation-ion exchange or
modification reagent; and presenting the immobilized nested fragments either directly to
mass spectrometric analysis or cleaving the purified fragment family off the support and
evaporating the cleavage reagent.

FIGURE 2A shows the Sanger sequencing products using ddTTP as 
terminating deoxynucleoside triphosphate of a hypothetical DNA fragment of 50 
nucleotides (SEQ ID NO:3) in length with approximately equally balanced base
composition. The molecular masses of the various chain terminated fragments are given.

FIGURE 2B shows an idealized mass spectrum of such a DNA fragment
mixture.

FIGURES 3A and 3B show, in analogy to FIGURES 2A and 2B, data for
the same model sequence (SEQ ID NO:3) with ddATP as chain terminator.

FIGURES 4A and 4B show data, analogous to FIGURES 2A and 2B,
when ddGTP is used as a chain terminator for the same model sequence (SEQ ID NO:3).

FIGURES 5A and 5B illustrate the results obtained where chain
termination is performed with ddCTP as a chain terminator, in a similar way as shown in
FIGURES 2A and 2B for the same model sequence (SEQ ID NO:3).

FIGURE 6 summarizes the results of FIGURES 2A to 5B, showing the 
correlation of molecular weights of the nested four fragment families to the DNA
sequence (SEQ ID NO:3). 

FIGURE 7 illustrates the general structure of mass-modified sequencing 
nucleic acid primers or tag sequencing probes for either Sanger DNA or Sanger RNA 
sequencing. 

FIGURE 8 shows the general structure for the mass-modified 
triphosphates for either Sanger DNA or Sanger RNA sequencing. General formulas of 
the chain-elongating and the chain-terminating nucleoside triphosphates are 
demonstrated. 

FIGURE 9 outlines various linking chemistries (X) with either 
polyethylene glycol or terminally monoalkylated polyethylene glycol (R) as an example. 

FIGURE 10 illustrates similar linking chemistries as shown in FIGURE 8
and depicts various mass modifying moieties (R).

FIGURE 11 outlines how multiplex mass spectrometric sequencing can
work using the mass-modified nucleic acid primer (UP). 

FIGURE 12 shows the process of multiplex mass spectrometric 
sequencing employing mass-modified chain-elongating and/or terminating nucleoside 
triphosphates. 

FIGURE 13 shows multiplex mass spectrometric sequencing by involving 
the hybridization of mass-modified tag sequence specific probes. 



WO 97/37041 PCT/US97/04394 


FIGURE 14 shows a MALDI-TOF spectrum of a mixture of
oligothymidylic acids, d(pT)12-18.

FIGURE 15 shows a superposition of MALDI-TOF spectra of the 50-mer
d(TAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT)
(SEQ ID NO:3) (500 fmol) and dT(pdT)99 (500 fmol).

FIGURE 16 shows the MALDI-TOF spectra of all 13 DNA sequences 
representing the nested dT-terminated fragments of the Sanger DNA sequencing 
simulation of Figure 2, 500 fmol each. 

FIGURE 17 shows the superposition of the spectra of FIGURE 16. The
two panels show two different scales and the spectra analyzed at that scale.

FIGURE 18 shows the superimposed MALDI-TOF spectra from MALDI-MS
analysis of mass-modified oligonucleotides as described in Example 21.

FIGURE 19 illustrates various linking chemistries between the solid
support (P) and the nucleic acid primer (NA) through a strong electrostatic interaction.

FIGURE 20 illustrates various linking chemistries between the solid
support (P) and the nucleic acid primer (NA) through a charge transfer complex of a
charge transfer acceptor (A) and a charge transfer donor (D).

FIGURE 21 illustrates various linking chemistries between the solid
support (P) and the nucleic acid primer (NA) through a stable organic radical.

FIGURE 22 illustrates a possible linking chemistry between the solid
support (P) and the nucleic acid primer (NA) through Watson-Crick base pairing.

FIGURE 23 illustrates linking the solid support (P) and the nucleic acid
primer (NA) through a photolytically cleavable bond.

FIGURE 24 shows the portion of the sequence of pRFc1 DNA, which
was used as template for PCR amplification of unmodified and 7-deazapurine containing
99-mer and 200-mer nucleic acids as well as the sequences of the 19-mer primers and the
two 18-mer reverse primers.

FIGURE 25 shows the portion of the nucleotide sequence of M13mp18
RFI DNA, which was used for PCR amplification of unmodified and 7-deazapurine
containing 103-mer nucleic acids. Also shown are nucleotide sequences of the 17-mer
primers used in the PCR.

FIGURE 26 shows the result of a polyacrylamide gel electrophoresis of
PCR products purified and concentrated for MALDI-TOF MS analysis. M: chain length
marker, lane 1: 7-deazapurine containing 99-mer PCR product, lane 2: unmodified
99-mer, lane 3: 7-deazapurine containing 103-mer and lane 4: unmodified 103-mer PCR
product.

FIGURE 27: an autoradiogram of polyacrylamide gel electrophoresis of
PCR reactions carried out with 5'-[32P]-labeled primers 1 and 4. Lanes 1 and 2:
unmodified and 7-deazapurine modified 103-mer PCR product (53321 and 23520
counts), lanes 3 and 4: unmodified and 7-deazapurine modified 200-mer (71123 and
39582 counts) and lanes 5 and 6: unmodified and 7-deazapurine modified 99-mer
(173216 and 94400 counts).

FIGURE 28: a) MALDI-TOF mass spectrum of the unmodified 103-mer
PCR products (sum of twelve single shot spectra). The mean value of the masses
calculated for the two single strands (31768 u and 31759 u) is 31763 u. Mass resolution:
18. b) MALDI-TOF mass spectrum of 7-deazapurine containing 103-mer PCR product
(sum of three single shot spectra). The mean value of the masses calculated for the two
single strands (31727 u and 31719 u) is 31723 u. Mass resolution: 67.

FIGURE 29: a) MALDI-TOF mass spectrum of the unmodified 99-mer
PCR product (sum of twenty single shot spectra). Values of the masses calculated for the
two single strands: 30261 u and 30794 u. b) MALDI-TOF mass spectrum of the
7-deazapurine containing 99-mer PCR product (sum of twelve single shot spectra). Values
of the masses calculated for the two single strands: 30224 u and 30750 u.

FIGURE 30: a) MALDI-TOF mass spectrum of the unmodified 200-mer
PCR product (sum of 30 single shot spectra). The mean value of the masses calculated
for the two single strands (61873 u and 61595 u) is 61734 u. Mass resolution: 28. b)
MALDI-TOF mass spectrum of 7-deazapurine containing 200-mer PCR product (sum of
30 single shot spectra). The mean value of the masses calculated for the two single
strands (61772 u and 61514 u) is 61643 u. Mass resolution: 39.

FIGURE 31: a) MALDI-TOF mass spectrum of 7-deazapurine containing
100-mer PCR product with ribomodified primers. The mean value of the masses
calculated for the two single strands (30529 u and 31095 u) is 30812 u. b) MALDI-TOF
mass spectrum of the PCR product after hydrolytic primer cleavage. The mean value of
the masses calculated for the two single strands (25104 u and 25229 u) is 25167 u. The
mean value of the cleaved primers (5437 u and 5918 u) is 5677 u.

FIGURE 32A-D shows the MALDI-TOF mass spectra of the four
sequencing ladders obtained from a 39-mer template (SEQ ID NO:13), which was
immobilized to streptavidin beads via a 3' biotinylation. A 14-mer primer (SEQ ID
NO:14) was used in the sequencing.

FIGURE 33 shows a MALDI-TOF mass spectrum of a solid state
sequencing of a 78-mer template (SEQ ID NO:15), which was immobilized to
streptavidin beads via a 3' biotinylation. An 18-mer primer (SEQ ID NO:16) and ddGTP
were used in the sequencing.

FIGURE 34 shows a scheme in which duplex DNA probes with single-
stranded overhang capture specific DNA templates and also serve as primers for solid
state sequencing.

FIGURE 35A-D shows MALDI-TOF mass spectra obtained from a 5'
fluorescent labeled 23-mer (SEQ ID NO:19) annealed to a 3' biotinylated 18-mer
(SEQ ID NO:20), leaving a 5-base overhang, which captured a 15-mer template (SEQ
ID NO:21).

FIGURE 36 shows a stacking fluorogram of the same products obtained
from the reaction described in FIGURE 35, but run on a conventional DNA sequencer.

Detailed Description of the Invention 

This invention describes an improved method of sequencing DNA. In 
particular, this invention employs mass spectrometry to analyze the Sanger sequencing 
reaction mixtures. 

In Sanger sequencing, four families of chain-terminated fragments are
obtained. The mass difference per nucleotide addition is 289.19 for dpC, 313.21 for dpA,
329.21 for dpG and 304.2 for dpT, respectively.

In one embodiment, through the separate determination of the molecular
weights of the four base-specifically terminated fragment families, the DNA sequence can
be assigned via superposition (e.g., interpolation) of the molecular weight peaks of the
four individual experiments. In another embodiment, the molecular weights of the four
specifically terminated fragment families can be determined simultaneously by MS, either
by mixing the products of all four reactions run in at least two separate reaction vessels
(i.e., all run separately, or two together, or three together) or by running one reaction
having all four chain-terminating nucleotides (e.g., a reaction mixture comprising dTTP,
ddTTP, dATP, ddATP, dCTP, ddCTP, dGTP, ddGTP) in one reaction vessel. By
simultaneously analyzing all four base-specifically terminated reaction products, the
molecular weight values have been, in effect, interpolated. Comparison of the mass
difference measured between fragments with the known masses of each chain-terminating
nucleotide allows the assignment of sequence to be carried out. In some instances, it may
be desirable to mass modify, as discussed below, the chain-terminating nucleotides so as
to expand the difference in molecular weight between each nucleotide. It will be apparent
to those skilled in the art when mass-modification of the chain-terminating nucleotides is
desirable; this can depend, for instance, on the resolving ability of the particular
spectrometer employed. By way of example, it may be desirable to produce four chain-
terminating nucleotides, ddTTP, ddCTP¹, ddATP² and ddGTP³, where ddCTP¹,
ddATP² and ddGTP³ have each been mass-modified so as to have molecular weights
resolvable from one another by the particular spectrometer being used.
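
By way of illustration only, the dependence on the resolving ability of the spectrometer can be estimated from the four mass increments quoted above. The following Python sketch finds the two nucleotides whose increments lie closest together and divides a fragment mass by that gap to give a rough figure of merit. The function names and the figure of merit itself are assumptions made for this illustration and are not part of the described method.

```python
from itertools import combinations

# Mass added per nucleotide, taken from the text above (mass units).
INCREMENT = {"C": 289.19, "A": 313.21, "G": 329.21, "T": 304.2}


def closest_pair(increments=INCREMENT):
    """Return the two bases whose increments are closest, and their gap."""
    pair = min(
        combinations(sorted(increments), 2),
        key=lambda p: abs(increments[p[0]] - increments[p[1]]),
    )
    return pair, abs(increments[pair[0]] - increments[pair[1]])


def discrimination_needed(fragment_mass, increments=INCREMENT):
    """Rough figure of merit (an assumption): fragment mass / smallest gap."""
    _, gap = closest_pair(increments)
    return fragment_mass / gap


if __name__ == "__main__":
    pair, gap = closest_pair()
    print("closest increments:", pair, "gap:", round(gap, 2))
    # 15344.02 is the mass quoted in the text for the 50-mer model sequence.
    print("figure of merit at 15344.02:", round(discrimination_needed(15344.02)))
```

With the unmodified increments, the closest pair is dpA and dpT, about 9 mass units apart, so an addition of A must be told apart from an addition of T on fragments weighing many thousands of mass units. Mass-modifying the terminators widens this gap and lowers the figure of merit accordingly.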

The terms chain-elongating nucleotides and chain-terminating nucleotides
are well known in the art. For DNA, chain-elongating nucleotides include
2'-deoxyribonucleotides and chain-terminating nucleotides include
2',3'-dideoxyribonucleotides. For RNA, chain-elongating nucleotides include
ribonucleotides and chain-terminating nucleotides include 3'-deoxyribonucleotides. The
term nucleotide is also well known in the art. For the purposes of this invention,
nucleotides include nucleoside mono-, di-, and triphosphates. Nucleotides also include
modified nucleotides such as phosphorothioate nucleotides.

Since mass spectrometry is a serial method, in contrast to currently used
slab gel electrophoresis which allows several samples to be processed in parallel, in
another embodiment of this invention, a further improvement can be achieved by
multiplex mass spectrometric DNA sequencing to allow simultaneous sequencing of more
than one DNA or RNA fragment. As described in more detail below, the range of about
300 mass units between one nucleotide addition can be utilized by employing either mass-
modified nucleic acid sequencing primers or chain-elongating and/or terminating
nucleoside triphosphates so as to shift the molecular weight of the base-specifically
terminated fragments of a particular DNA or RNA species being sequenced in a
predetermined manner. For the first time, several sequencing reactions can be mass
spectrometrically analyzed in parallel. In yet another embodiment of this invention,
multiplex mass spectrometric DNA sequencing can be performed by mass modifying the
fragment families through specific oligonucleotides (tag probes) which hybridize to
specific tag sequences within each of the fragment families. In another embodiment, the
tag probe can be covalently attached to the individual and specific tag sequence prior to
mass spectrometry.
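
The effect of a mass-modified primer on a pooled spectrum can be illustrated with a short Python sketch. It builds the ladder of extension-product masses for two templates and reports the smallest gap between any two peaks after pooling. The primer mass of 5000, the shift of 100 mass units and the four-base extensions are hypothetical values chosen only for the illustration, and the sketch makes the simplifying assumption that a terminating nucleotide adds the same mass as an elongating one.

```python
from itertools import accumulate

# Mass added per nucleotide, taken from the text above (mass units).
INCREMENT = {"C": 289.19, "A": 313.21, "G": 329.21, "T": 304.2}


def ladder(primer_mass, extension):
    """Masses of the primer extended by 1, 2, ... nucleotides of `extension`.

    Simplification (an assumption): a terminating nucleotide is given the
    same increment as an elongating one.
    """
    steps = (INCREMENT[base] for base in extension)
    return list(accumulate(steps, initial=primer_mass))[1:]


def min_spacing(*ladders):
    """Smallest gap between any two peaks once all ladders are pooled."""
    peaks = sorted(mass for lad in ladders for mass in lad)
    return min(b - a for a, b in zip(peaks, peaks[1:]))


if __name__ == "__main__":
    primer = 5000.0   # hypothetical primer mass
    shift = 100.0     # hypothetical mass modification on the second primer
    plain = min_spacing(ladder(primer, "ACGT"), ladder(primer, "ACGA"))
    shifted = min_spacing(ladder(primer, "ACGT"), ladder(primer + shift, "ACGA"))
    print("unmodified primers, smallest gap:", round(plain, 2))
    print("one primer mass-modified, smallest gap:", round(shifted, 2))
```

With two unmodified primers of equal mass, the shared first three extension products coincide and the smallest gap is zero. Shifting one primer by a defined increment separates the two families, which is the principle relied upon for multiplexing.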

Preferred mass spectrometer formats for use in the invention are matrix
assisted laser desorption ionization (MALDI), electrospray (ES), ion cyclotron resonance
(ICR) and Fourier Transform. For ES, the samples, dissolved in water or in a volatile
buffer, are injected either continuously or discontinuously into an atmospheric pressure
ionization interface (API) and then mass analyzed by a quadrupole. The generation of
multiple ion peaks which can be obtained using ES mass spectrometry can increase the
accuracy of the mass determination. Even more detailed information on the specific
structure can be obtained using an MS/MS quadrupole configuration.

In MALDI mass spectrometry, various mass analyzers can be used, e.g.,
magnetic sector/magnetic deflection instruments in single or triple quadrupole mode
(MS/MS), Fourier transform and time-of-flight (TOF) configurations as is known in the
art of mass spectrometry. For the desorption/ionization process, numerous matrix/laser
combinations can be used. Ion-trap and reflectron configurations can also be employed.

In one embodiment of the invention, the molecular weight values of at
least two base-specifically terminated fragments are determined concurrently using mass
spectrometry. The molecular weight values of preferably at least five and more
preferably at least ten base-specifically terminated fragments are determined by mass
spectrometry. Also included in the invention are determinations of the molecular weight
values of at least 20 base-specifically terminated fragments and at least 30 base-
specifically terminated fragments. Further, the nested base-specifically terminated
fragments in a specific set can be purified of all reactants and by-products but are not
separated from one another. The entire set of nested base-specifically terminated
fragments is analyzed concurrently and the molecular weight values are determined. At
least two base-specifically terminated fragments are analyzed concurrently by mass
spectrometry when the fragments are contained in the same sample.

In general, the overall mass spectrometry DNA sequencing process will
start with a library of small genomic fragments obtained after first randomly or
specifically cutting the genomic DNA into large pieces which then, in several subcloning
steps, are reduced in size and inserted into vectors like derivatives of M13 or pUC (e.g.,
M13mp18 or M13mp19) (see FIGURE 1). In a different approach, the fragments
inserted in vectors, such as M13, are obtained via subcloning starting with a cDNA
library. In yet another approach, the DNA fragments to be sequenced are generated by
the polymerase chain reaction (e.g., Higuchi et al., "A General Method of in vitro
Preparation and Mutagenesis of DNA Fragments: Study of Protein and DNA
Interactions," Nucleic Acids Res. 16, 7351-67 (1988)). As is known in the art, Sanger
sequencing can start from one nucleic acid primer (UP) binding to the plus-strand or from
another nucleic acid primer binding to the opposite minus-strand. Thus, either the
complementary sequence of both strands of a given unknown DNA sequence can be
obtained (providing for reduction of ambiguity in the sequence determination) or the
length of the sequence information obtainable from one clone can be extended by
generating sequence information from both ends of the unknown vector-inserted DNA
fragment.

The nucleic acid primer carries, preferentially at the 5'-end, a linking
functionality, L, which can include a spacer of sufficient length and which can interact
with a suitable functionality, L', on a solid support to form a reversible linkage such as a
photocleavable bond. Since each of the four Sanger sequencing families starts with a
nucleic acid primer (L-UP; FIGURE 1), this fragment family can be bound to the solid
support by reacting with functional groups, L', on the surface of a solid support and then
intensively washed to remove all buffer salts, triphosphates, enzymes, reaction
by-products, etc. Furthermore, for mass spectrometric analysis, it can be of importance at
this stage to exchange the cation at the phosphate backbone of the DNA fragments in
order to eliminate peak broadening due to a heterogeneity in the cations bound per
nucleotide unit. Since the L-L' linkage is only of a temporary nature, with the purpose of
capturing the nested Sanger DNA or RNA fragments to properly condition them for mass
spectrometric analysis, there are different chemistries which can serve this purpose. In
addition to the examples given in which the nested fragments are coupled covalently to
the solid support, washed, and cleaved off the support for mass spectrometric analysis,
the temporary linkage can be such that it is cleaved under the conditions of mass
spectrometry, i.e., a photocleavable bond such as a charge transfer complex or a stable
organic radical. Furthermore, the linkage can be formed with L' being a quaternary
ammonium group (some examples are given in FIGURE 19). In this case, preferably, the
surface of the solid support carries negative charges which repel the negatively charged
nucleic acid backbone and thus facilitate desorption. Desorption will take place either
by the heat created by the laser pulse and/or, depending on L', by specific absorption of
laser energy which is in resonance with the L' chromophore (see, e.g., examples given in
FIGURE 19). The functionalities, L and L', can also form a charge transfer complex and
thereby form the temporary L-L' linkage. Various examples for appropriate
functionalities with either acceptor or donor properties are depicted without limitation
in FIGURE 20. Since in many cases the "charge-transfer band" can be determined by
UV/vis spectrometry (see e.g. Organic Charge Transfer Complexes by R. Foster,
Academic Press, 1969), the laser energy can be tuned to the corresponding energy of the
charge-transfer wavelength and, thus, a specific desorption off the solid support can be
initiated. Those skilled in the art will recognize that several combinations can serve this
purpose and that the donor functionality can be either on the solid support or coupled to
the nested Sanger DNA/RNA fragments or vice versa.

In yet another approach, the temporary linkage L-L' can be generated by
homolytically forming relatively stable radicals as exemplified in FIGURE 21. In example
4 of FIGURE 21, a combination of the approaches using charge-transfer complexes and
stable organic radicals is shown. Here, the nested Sanger DNA/RNA fragments are
captured via the formation of a charge transfer complex. Under the influence of the laser
pulse, desorption (as discussed above) as well as ionization will take place at the radical
position. In the other examples of FIGURE 21, under the influence of the laser pulse, the
L-L' linkage will be cleaved and the nested Sanger DNA/RNA fragments desorbed and
subsequently ionized at the radical position formed. Those skilled in the art will
recognize that other organic radicals can be selected and that, in relation to the
dissociation energies needed to homolytically cleave the bond between them, a
corresponding laser wavelength can be selected (see e.g. Reactive Molecules by C.
Wentrup, John Wiley & Sons, 1984). In yet another approach, the nested Sanger
DNA/RNA fragments are captured via Watson-Crick base pairing to a solid support-
bound oligonucleotide complementary to either the sequence of the nucleic acid primer or
the tag oligonucleotide sequence (see FIGURE 22). The duplex formed will be cleaved
under the influence of the laser pulse and desorption can be initiated. The solid support-
bound base sequence can be presented through natural oligoribo- or
oligodeoxyribonucleotide as well as analogs (e.g. thio-modified phosphodiester or
phosphotriester backbone) or employing oligonucleotide mimetics such as PNA analogs
(see e.g. Nielsen et al., Science, 254, 1497 (1991)) which render the base sequence less
susceptible to enzymatic degradation and hence increase overall stability of the solid
support-bound capture base sequence. With appropriate bonds, L-L', a cleavage can be
obtained directly with a laser tuned to the energy necessary for bond cleavage. Thus, the
immobilized nested Sanger fragments can be directly ablated during mass spectrometric
analysis.

Prior to mass spectrometric analysis, it may be useful to "condition"
nucleic acid molecules, for example to decrease the laser energy required for volatilization,
to minimize fragmentation or to otherwise increase the sensitivity of mass spectrometric
detection. For example, nucleic acids can be "conditioned" by adding positive or
negative charges, i.e. charge tags (CTs). CTs increase the mass spectrometer detection
sensitivity by increasing the degree of ionization during the mass spectrometric
(e.g. MALDI) process. A CT can be linked either to the external 3' or 5' position or
internally, e.g. at the X position or at the base, e.g. at C-5 in uracil, C-5 methyl group of
thymine, C-5 at cytosine, at C-7 or C-8 of guanine, adenine and hypoxanthine or at the
phosphate ester moiety. Charge tags, CTs, can be molecules with permanent (i.e.
pH-independent) ionization, such as:

[Chemical structure not reproduced.]
or molecules which generate a positive charge upon MALDI and which are stabilized by
delocalization of the positive charge by mesomeric effects in unsaturated and/or aromatic
systems such as:

[Chemical structure not reproduced.]

wherein,

R, R¹, R² = H, OAl (wherein Al = e.g. lower alkyl, methyl,
ethyl, propyl), NO2, CN, CO2H, CO2 active ester, or
halogen; and

X = -O-, -NH-, -S-, C=O, OCO either in the para or meta
position.

For example, the positive charge of a trityl cation is produced during MALDI by the
removal of a moiety such as -OR, where R = a lower alkyl, or an anion such as ClO4⁻,
SbF6⁻, BF4⁻ and the like.



In an alternative scheme, the trityl group is used to anchor the
oligonucleotide to a solid support via the tertiary carbon and this bond is cleaved during
mass spectrometry (e.g. MALDI), leaving a positive charge on the desorbing and high
vacuum flying oligonucleotide.

One of skill in the art can readily appreciate several variations to the schemes described
above. In addition to employing the charge tag array alone, one of skill in the art can
employ a charge tag array in conjunction with another conditioning means. Particularly
preferred means to be used in conjunction with the CT include treating the
phosphodiester bond with trialkylsilyl halides or the phosphomonothiodiester bond with
alkyliodides to render the polyanionic backbone neutral.

Another example of conditioning is modification of the phosphodiester
backbone of the nucleic acid molecule (e.g. cation exchange), which can be useful for
eliminating peak broadening due to a heterogeneity in the cations bound per nucleotide
unit. In addition, when a nucleic acid molecule is contacted with an alkylating agent such
as alkyliodide, iodoacetamide, β-iodoethanol, or 2,3-epoxy-1-propanol, the monothio
phosphodiester bonds of the nucleic acid molecule can be transformed into
phosphotriester bonds. Likewise, phosphodiester bonds may be transformed to uncharged
derivatives employing trialkylsilyl chlorides. Further conditioning involves incorporating
nucleotides which reduce sensitivity for depurination (fragmentation during MS) such as
N7- or N9-deazapurine nucleotides, or RNA building blocks or using oligonucleotide
triesters or incorporating phosphorothioate functions which are alkylated or employing
oligonucleotide mimetics such as PNA.

Modification of the phosphodiester backbone can be accomplished by, for
example, using alpha-thio modified nucleotides for chain elongation and termination.
With alkylating agents such as alkyliodides, iodoacetamide, β-iodoethanol, 2,3-epoxy-1-
propanol (see FIGURE 10), the monothio phosphodiester bonds of the nested Sanger
fragments are transformed into phosphotriester bonds. Multiplexing by mass
modification in this case is obtained by mass-modifying the nucleic acid primer (UP) or
the nucleoside triphosphates at the sugar or the base moiety. To those skilled in the art,
other modifications of the nested Sanger fragments can be envisioned. In one
embodiment of the invention, the linking chemistry allows one to cleave off the
so-purified nested DNA enzymatically, chemically or physically. By way of example, the
L-L' chemistry can be of a type of disulfide bond (chemically cleavable, for example, by
mercaptoethanol or dithioerythritol), a biotin/streptavidin system, a heterobifunctional
derivative of a trityl ether group (Koster et al., "A Versatile Acid-Labile Linker for
Modification of Synthetic Biomolecules," Tetrahedron Letters 31, 7095 (1990)) which
can be cleaved under mildly acidic conditions, a levulinyl group cleavable under almost
neutral conditions with a hydrazinium/acetate buffer, an arginine-arginine or lysine-lysine
bond cleavable by an endopeptidase enzyme like trypsin or a pyrophosphate bond
cleavable by a pyrophosphatase, a photocleavable bond which can be, for example,
physically cleaved and the like (see, e.g., FIGURE 23). Optionally, another cation
exchange can be performed prior to mass spectrometry analysis. In the instance that an
enzyme-cleavable bond is utilized to immobilize the nested fragments, the enzyme used to
cleave the bond can serve as an internal mass standard during MS analysis.

The purification process and/or ion exchange process can be carried out by
a number of other methods instead of, or in conjunction with, immobilization on a solid
support. For example, the base-specifically terminated products can be separated from
the reactants by dialysis, filtration (including ultrafiltration), and chromatography.
Likewise, these techniques can be used to exchange the cation of the phosphate backbone
with a counter-ion which reduces peak broadening.

The base-specifically terminated fragment families can be generated by
standard Sanger sequencing using the large Klenow fragment of E. coli DNA
polymerase I, by Sequenase, Taq DNA polymerase and other DNA polymerases suitable
for this purpose, thus generating nested DNA fragments for the mass spectrometric
analysis. It is, however, part of this invention that base-specifically terminated RNA
transcripts of the DNA fragments to be sequenced can also be utilized for mass
spectrometric sequence determination. In this case, various RNA polymerases such as
the SP6 or the T7 RNA polymerase can be used on appropriate vectors containing, for
example, the SP6 or the T7 promoters (e.g. Axelrod et al., "Transcription from
Bacteriophage T7 and SP6 RNA Polymerase Promoters in the Presence of
3'-Deoxyribonucleoside 5'-triphosphate Chain Terminators," Biochemistry 24, 5716-23
(1985)). In this case, the unknown DNA sequence fragments are inserted downstream
from such promoters. Transcription can also be initiated by a nucleic acid primer (Pitulle
et al., "Initiator Oligonucleotides for the Combination of Chemical and Enzymatic RNA
Synthesis," Gene 112, 101-105 (1992)) which carries, as one embodiment of this
invention, appropriate linking functionalities, L, which allow the immobilization of the
nested RNA fragments, as outlined above, prior to mass spectrometric analysis for
purification and/or appropriate modification and/or conditioning.

For this immobilization process of the DNA/RNA sequencing products for
mass spectrometric analysis, various solid supports can be used, e.g., beads (silica gel,
controlled pore glass, magnetic beads, Sephadex/Sepharose beads, cellulose beads, etc.),
capillaries, glass fiber filters, glass surfaces, metal surfaces or plastic material. Examples
of useful plastic materials include membranes in filter or microtiter plate formats, the
latter allowing the automation of the purification process by employing microtiter plates
which, as one embodiment of the invention, carry a permeable membrane in the bottom of
the well functionalized with L'. Membranes can be based on polyethylene, polypropylene,
polyamide, polyvinylidenedifluoride and the like. Examples of suitable metal surfaces
include steel, gold, silver, aluminum, and copper. After purification, cation exchange,
and/or modification of the phosphodiester backbone of the L-L' bound nested Sanger
fragments, they can be cleaved off the solid support chemically, enzymatically or
physically. Also, the L-L' bound fragments can be cleaved from the support when they
are subjected to mass spectrometry analysis by using appropriately chosen L-L' linkages
and corresponding laser energies/intensities as described above and in FIGURES 19-23.

The highly purified, four base-specifically terminated DNA or RNA 
fragment families are then analyzed with regard to their fragment lengths via 
determination of their respective molecular weights by MALDI or ES mass spectrometry. 

For ES, the samples, dissolved in water or in a volatile buffer, are injected
either continuously or discontinuously into an atmospheric pressure ionization interface
(API) and then mass analyzed by a quadrupole. With the aid of a computer program, the
molecular weight peaks are searched for the known molecular weight of the nucleic acid
primer (UP), and it is determined which of the four chain-terminating nucleotides has been
added to the UP. This represents the first nucleotide of the unknown sequence. Then,
the second, the third, the nth extension product can be identified in a similar manner and,
by this, the nucleotide sequence is assigned. The generation of multiple ion peaks which
can be obtained using ES mass spectrometry can increase the accuracy of the mass
determination.
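
The peak search described above can be sketched, for illustration only, as a short Python routine that starts at the primer mass and repeatedly looks for a peak lying one nucleotide increment higher. The function name, the tolerance of 2 mass units and the peak values in the usage lines are assumptions made for this sketch. It also assumes that every step, including the first one from the primer, is taken directly from the mass increments quoted earlier.

```python
# Mass added per nucleotide, taken from the description above (mass units).
INCREMENT = {"C": 289.19, "A": 313.21, "G": 329.21, "T": 304.2}


def read_sequence(primer_mass, peaks, tolerance=2.0):
    """Walk the ladder of peaks upward from the primer, one base per step.

    `tolerance` (hypothetical value) is the allowed deviation between a
    measured step and a nucleotide increment. The walk stops as soon as
    no peak, or more than one candidate, fits the next step.
    """
    current = primer_mass
    called = []
    while True:
        hits = [
            (base, mass)
            for mass in sorted(peaks)
            for base, inc in INCREMENT.items()
            if abs(mass - current - inc) <= tolerance
        ]
        if len(hits) != 1:
            break
        base, current = hits[0]  # continue from the measured peak
        called.append(base)
    return "".join(called)


if __name__ == "__main__":
    # Hypothetical primer mass and slightly noisy peak list.
    print(read_sequence(5000.0, [5313.7, 5602.1, 5931.9, 6235.5]))
```

Continuing each step from the measured peak rather than from a running calculated total keeps a constant offset in the spectrum from accumulating. Stopping on ambiguity is a deliberately conservative choice for the sketch.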

In MALDI mass spectrometry, various mass analyzers can be used, e.g.,
magnetic sector/magnetic deflection instruments in single or triple quadrupole mode
(MS/MS), Fourier transform and time-of-flight (TOF) configurations as is known in the
art of mass spectrometry. FIGURES 2A through 6 are given as an example of the data
obtainable when sequencing a hypothetical DNA fragment of 50 nucleotides in length
(SEQ ID NO:3) and having a molecular weight of 15,344.02 daltons. The molecular
weights calculated for the ddT (FIGURES 2A and 2B), ddA (FIGURES 3A and 3B),
ddG (FIGURES 4A and 4B) and ddC (FIGURES 5A and 5B) terminated products are
given (corresponding to fragments of SEQ ID NO:3) and the idealized four MALDI-TOF
mass spectra shown. All four spectra are superimposed, and from this, the DNA
sequence can be generated. This is shown in the summarizing FIGURE 6, demonstrating
how the molecular weights are correlated with the DNA sequence. MALDI-TOF spectra
have been generated for the ddT terminated products (FIGURE 16) corresponding to
those shown in FIGURE 2 and these spectra have been superimposed (FIGURE 17).
The correlation of calculated molecular weights of the ddT fragments and their
experimentally verified weights is shown in Table 1. Likewise, if all four chain-
terminating reactions are combined and then analyzed by mass spectrometry, the
molecular weight difference between two adjacent peaks can be used to determine the
sequence. For the desorption/ionization process, numerous matrix/laser combinations
can be used.



TABLE 1

Correlation of calculated and experimentally verified molecular weights of the 13 DNA
fragments of FIGURES 2 and 16.

Fragment (n-mer)    calculated mass    experimental mass    difference

7-mer                2104.45            2119.9              +15.4
10-mer               3011.04            3026.1              +15.1
11-mer               3315.24            3330.1              +14.9
19-mer               5771.82            5788.0              +16.2
20-mer               6076.02            6093.8              +17.8
24-mer               7311.82            7374.9              +63.1
26-mer               7945.22            7960.9              +15.7
33-mer              10112.63           10125.3              +12.7
37-mer              11348.43           11361.4              +13.0
38-mer              11652.62           11670.2              +17.6
42-mer              12872.42           12888.3              +15.9
46-mer              14108.22           14125.0              +16.8
50-mer              15344.02           15362.6              +18.6
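
The differences listed in Table 1 can be recomputed, and the entry that departs from the otherwise fairly constant offset can be singled out, with a short Python sketch such as the following. The function names and the threshold of 10 mass units around the median are assumptions made for this illustration; the numerical values are those of Table 1.

```python
from statistics import median

# (fragment length, calculated mass, experimental mass), copied from Table 1.
TABLE_1 = [
    (7, 2104.45, 2119.9), (10, 3011.04, 3026.1), (11, 3315.24, 3330.1),
    (19, 5771.82, 5788.0), (20, 6076.02, 6093.8), (24, 7311.82, 7374.9),
    (26, 7945.22, 7960.9), (33, 10112.63, 10125.3), (37, 11348.43, 11361.4),
    (38, 11652.62, 11670.2), (42, 12872.42, 12888.3), (46, 14108.22, 14125.0),
    (50, 15344.02, 15362.6),
]


def offsets(rows):
    """Experimental minus calculated mass for each fragment length."""
    return {n: round(exp - calc, 2) for n, calc, exp in rows}


def outliers(rows, limit=10.0):
    """Fragment lengths whose offset is far from the median offset.

    `limit` is a hypothetical threshold in mass units.
    """
    diffs = offsets(rows)
    centre = median(diffs.values())
    return [n for n, d in diffs.items() if abs(d - centre) > limit]


if __name__ == "__main__":
    for n, d in offsets(TABLE_1).items():
        print(f"{n}-mer: {d:+.2f}")
    print("outliers:", outliers(TABLE_1))
```

Twelve of the thirteen fragments are offset from their calculated masses by roughly 13 to 19 mass units, while the 24-mer is offset by about 63, so a check of this kind flags only the 24-mer entry.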



In order to increase throughput to a level necessary for high volume
genomic and cDNA sequencing projects, a further embodiment of the present invention is
to utilize multiplex mass spectrometry to simultaneously determine more than one
sequence. This can be achieved by several, albeit different, methodologies, the basic
principle being the mass modification of the nucleic acid primer (UP), the chain-
elongating and/or terminating nucleoside triphosphates, or by using mass-differentiated
tag probes hybridizable to specific tag sequences. The term "nucleic acid primer" as used
herein encompasses primers for both DNA and RNA Sanger sequencing.

By way of example, FIGURE 7 presents a general formula of the nucleic 
acid primer (UP) and the tag probes (TP). The mass modifying moiety can be attached, 
for instance, to either the 5'-end of the oligonucleotide (M¹), to the nucleobase (or bases)
(M², M⁷), to the phosphate backbone (M³), and to the 2'-position of the nucleoside
(nucleosides) (M⁴, M⁶) or/and to the terminal 3'-position (M⁵). Primer length can vary
between 1 and 50 nucleotides. For the priming of DNA Sanger sequencing, the
primer is preferentially in the range of about 15 to 30 nucleotides in length. For
artificially priming the transcription in an RNA polymerase-mediated Sanger sequencing
reaction, the length of the primer is preferentially in the range of about 2 to 6 nucleotides.
If a tag probe (TP) is to hybridize to the integrated tag sequence of a family of
chain-terminated fragments, its preferential length is about 20 nucleotides.

The table in FIGURE 7 depicts some examples of mass-modified primer/tag 
probe configurations for DNA, as well as RNA, Sanger sequencing. This list is, however,
not meant to be limiting, since numerous other combinations of mass-modifying functions
and positions within the oligonucleotide molecule are possible and are deemed part of the
invention. The mass-modifying functionality can be, for example, a halogen, an azido, or
of the type, XR, wherein X is a linking group and R is a mass-modifying functionality.
The mass-modifying functionality can thus be used to introduce defined mass increments 
into the oligonucleotide molecule. 

In another embodiment, the nucleotides used for chain-elongation and/or 
termination are mass-modified. Examples of such modified nucleotides are shown in 
FIGURE 8. Here the mass-modifying moiety, M, can be attached either to the
nucleobase, M² (in case of the c⁷-deazanucleosides also to C-7, M⁷), to the triphosphate
group at the alpha phosphate, M³, or to the 2'-position of the sugar ring of the nucleoside
triphosphate, M⁴ and M⁶. Furthermore, the mass-modifying functionality can be added
so as to affect chain termination, such as by attaching it to the 3'-position of the sugar
ring in the nucleoside triphosphate, M⁵. The list in FIGURE 8 represents examples of
possible configurations for generating chain-terminating nucleoside triphosphates for
RNA or DNA Sanger sequencing. For those skilled in the art, however, it is clear that
many other combinations can serve the purpose of the invention equally well. In the
same way, those skilled in the art will recognize that chain-elongating nucleoside
triphosphates can also be mass-modified in a similar fashion with numerous variations and
combinations in functionality and attachment positions.

Without limiting the scope of the invention, FIGURE 9 gives a more 
detailed description of particular examples of how the mass-modification, M, can be 
introduced for X in XR as well as using oligo-/polyethylene glycol derivatives for R. The 
mass-modifying increment in this case is 44, i.e., five different mass-modified species can
be generated by just changing m from 0 to 4, thus adding mass units of 45 (m=0), 89
(m=1), 133 (m=2), 177 (m=3) and 221 (m=4) to the nucleic acid primer (UP), the tag
probe (TP) or the nucleoside triphosphates, respectively. The oligo/polyethylene glycols
can also be monoalkylated by a lower alkyl such as methyl, ethyl, propyl, isopropyl,
t-butyl and the like. A selection of linking functionalities, X, is also illustrated. Other
chemistries can be used in the mass-modified compounds, as for example, those described
recently in Oligonucleotides and Analogues: A Practical Approach, F. Eckstein, editor,
IRL Press, Oxford, 1991.
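
The oligo/polyethylene glycol series described above follows a simple rule: 45 mass units for m = 0 plus 44 for each further glycol unit. A minimal Python sketch of that arithmetic is given below; the function name and parameter names are hypothetical, while the numbers 45 and 44 are those stated in the text.

```python
def peg_mass_increments(max_m=4, base=45, step=44):
    """Mass units added for m = 0 .. max_m glycol units."""
    return [base + step * m for m in range(max_m + 1)]


print(peg_mass_increments())  # [45, 89, 133, 177, 221]
```

Each value differs from its neighbor by 44 mass units, which is the spacing that lets the five species be told apart in one spectrum.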

In yet another embodiment, various mass-modifying functionalities, R, other 
than oligo/polyethylene glycols, can be selected and attached via appropriate linking 

chemistries, X. Without any limitation, some examples are given in FIGURE 10. A
simple mass-modification can be achieved by substituting halogens like F, Cl, Br
and/or I, or pseudohalogens such as SCN, NCS, for H, or by using different alkyl, aryl or
aralkyl moieties such as methyl, ethyl, propyl, isopropyl, t-butyl, hexyl, phenyl, substituted
phenyl, benzyl, or functional groups such as CH₂F, CHF₂, CF₃, Si(CH₃)₃,
Si(CH₃)₂(C₂H₅), Si(CH₃)(C₂H₅)₂, Si(C₂H₅)₃. Yet another mass-modification can be
obtained by attaching homo- or heteropeptides through X to the UP, TP or nucleoside
triphosphates. One example useful in generating mass-modified species with a mass
increment of 57 is the attachment of oligoglycines, e.g., mass-modifications of 74 (r=1,
m=0), 131 (r=1, m=2), 188 (r=1, m=3), 245 (r=1, m=4) are achieved. Simple
oligoamides also can be used, e.g., mass-modifications of 74 (r=1, m=0), 88 (r=2, m=0),
102 (r=3, m=0), 116 (r=4, m=0), etc. are obtainable. For those skilled in the art, it will
be obvious that there are numerous possibilities in addition to those given in FIGURE 10
and the above mentioned reference (Oligonucleotides and Analogues, F. Eckstein, 1991),
for introducing, in a predetermined manner, many different mass-modifying functionalities
to UP, TP and nucleoside triphosphates which are acceptable for DNA and RNA Sanger
sequencing.
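
The two peptide-type series quoted above are arithmetic progressions: the oligoglycine values rise from 74 in steps of 57, and the oligoamide values rise from 74 in steps of 14. The Python sketch below regenerates both lists and reports the smallest spacing within the pooled set. The helper names are hypothetical, and the step index k is only a counter; it is not meant to reproduce the r and m indices of FIGURE 10.

```python
def series(start, step, count):
    """Arithmetic series of mass modifications: start, start + step, ..."""
    return [start + step * k for k in range(count)]


def min_spacing(values):
    """Smallest gap between distinct values, i.e. the resolution needed."""
    ordered = sorted(set(values))
    return min(b - a for a, b in zip(ordered, ordered[1:]))


oligoglycine = series(74, 57, 4)  # 74, 131, 188, 245
oligoamide = series(74, 14, 4)    # 74, 88, 102, 116

print(oligoglycine, oligoamide)
print(min_spacing(oligoglycine + oligoamide))  # 14
```

The value 74 is common to both lists, so a pooled set built from both families contains seven distinct mass modifications rather than eight.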

As used herein, the superscript 0-i designates i + 1 mass-differentiated
nucleotides, primers or tags. In some instances, the superscript 0 (e.g., NTP⁰, UP⁰) can
designate an unmodified species of a particular reactant, and the superscript i (e.g., NTP¹,
NTP², NTP³, etc.) can designate the i-th mass-modified species of that reactant. If, for
example, more than one species of nucleic acids (e.g., DNA clones) are to be
concurrently sequenced by multiplex DNA sequencing, then i + 1 different mass-modified
nucleic acid primers (UP⁰, UP¹, ... UPⁱ) can be used to distinguish each set of
base-specifically terminated fragments, wherein each species of mass-modified UPⁱ can be
distinguished by mass spectrometry from the rest.

As illustrative embodiments of this invention, three different basic processes 
for multiplex mass spectrometric DNA sequencing employing the described mass- 
modified reagents are described below: 

A) Multiplexing by the use of mass-modified nucleic acid primers
(UP) for Sanger DNA or RNA sequencing (see for example FIGURE 11);

B) Multiplexing by the use of mass-modified nucleoside
triphosphates as chain elongators and/or chain terminators for Sanger
DNA or RNA sequencing (see for example FIGURE 12); and

C) Multiplexing by the use of tag probes which specifically
hybridize to tag sequences which are integrated into part of the four
Sanger DNA/RNA base-specifically terminated fragment families. Mass
modification here can be achieved as described for FIGURES 7, 9 and 10,
or alternately, by designing different oligonucleotide sequences having the
same or different length with unmodified nucleotides which, in a
predetermined way, generate appropriately differentiated molecular
weights (see for example FIGURE 13).

The process of multiplexing by mass-modified nucleic acid primers (UP) is
illustrated by way of example in FIGURE 11 for mass analyzing four different DNA
clones simultaneously. The first reaction mixture is obtained by standard Sanger DNA
sequencing having unknown DNA fragment 1 (clone 1) integrated in an appropriate
vector (e.g., M13mp18), employing an unmodified nucleic acid primer UP⁰, and a
standard mixture of the four unmodified deoxynucleoside triphosphates, dNTP⁰, and
with 1/10th of one of the four dideoxynucleoside triphosphates, ddNTP⁰. A second
reaction mixture for DNA fragment 2 (clone 2) is obtained by employing a mass-modified
nucleic acid primer UP¹ and, as before, the four unmodified nucleoside triphosphates,
dNTP⁰, containing in each separate Sanger reaction 1/10th of the chain-terminating
unmodified dideoxynucleoside triphosphates ddNTP⁰. In the other two experiments, the
four Sanger reactions have the following compositions: DNA fragment 3 (clone 3), UP²,
dNTP⁰, ddNTP⁰ and DNA fragment 4 (clone 4), UP³, dNTP⁰, ddNTP⁰. For mass
spectrometric DNA sequencing, all base-specifically terminated reactions of the four
clones are pooled and mass analyzed. The various mass peaks belonging to the four
dideoxy-terminated (e.g., ddT-terminated) fragment families are assigned to specifically
elongated and ddT-terminated fragments by searching (such as by a computer program)
for the known molecular ion peaks of UP⁰, UP¹, UP² and UP³ extended by either one of
the four dideoxynucleoside triphosphates, UP⁰-ddN⁰, UP¹-ddN⁰, UP²-ddN⁰ and
UP³-ddN⁰. In this way, the first nucleotides of the four unknown DNA sequences of clones 1
to 4 are determined. The process is repeated, having memorized the molecular masses of
the four specific first extension products, until the four sequences are assigned.
Unambiguous mass/sequence assignments are possible even in the worst case scenario in
which the four mass-modified nucleic acid primers are extended by the same
dideoxynucleoside triphosphate, the extension products then being, for example,
UP⁰-ddT, UP¹-ddT, UP²-ddT and UP³-ddT, which differ by the known mass increment
differentiating the four nucleic acid primers. In another embodiment of this invention, an
analogous technique is employed using different vectors containing, for example, the SP6
and/or T7 promoter sequences, and performing transcription with the nucleic acid
primers UP⁰, UP¹, UP² and UP³ and an RNA polymerase (e.g., SP6 or T7 RNA
polymerase) with chain-elongating and terminating unmodified nucleoside triphosphates
NTP⁰ and 3'-dNTP⁰. Here, the DNA sequence is being determined by Sanger RNA
sequencing.
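
The computer search for first extension products mentioned above can be sketched as follows. This is a hedged illustration only: the primer masses are hypothetical (an arbitrary 5000.0 Da primer plus the 45, 89 and 133 mass-unit increments discussed for FIGURE 9), the dideoxynucleotide residue masses are assumed standard average values (deoxynucleotide residue minus one oxygen), and the function name and tolerance are illustrative.

```python
# Assumed average masses (Da) added by a terminal 2',3'-dideoxynucleotide residue.
DD_RESIDUE = {"A": 297.21, "C": 273.18, "G": 313.21, "T": 288.20}

# Hypothetical primer masses: unmodified UP0 and three mass-modified primers.
PRIMERS = {"UP0": 5000.0, "UP1": 5045.0, "UP2": 5089.0, "UP3": 5133.0}


def assign_first_extension(peaks, primer_masses, tolerance=1.0):
    """Map each primer to the terminators whose expected mass matches a peak."""
    calls = {}
    for name, primer_mass in primer_masses.items():
        for base, residue in DD_RESIDUE.items():
            expected = primer_mass + residue
            if any(abs(peak - expected) <= tolerance for peak in peaks):
                calls.setdefault(name, []).append(base)
    return calls


# Worst case from the text: all four primers extended by ddT.
pooled = [5288.20, 5333.20, 5377.20, 5421.20]
print(assign_first_extension(pooled, PRIMERS))
```

With these illustrative numbers every expected extension mass is at least about 4 Da away from any other, so a 1 Da tolerance still yields one call per primer even in the worst case.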

FIGURE 12 illustrates the process of multiplexing by mass-modified chain-
elongating or/and terminating nucleoside triphosphates in which three different DNA
fragments (3 clones) are mass analyzed simultaneously. The first DNA Sanger
sequencing reaction (DNA fragment 1, clone 1) is the standard mixture employing
unmodified nucleic acid primer UP⁰, dNTP⁰ and in each of the four reactions one of the
four ddNTP⁰. The second (DNA fragment 2, clone 2) and the third (DNA fragment 3,
clone 3) have the following contents: UP⁰, dNTP⁰, ddNTP¹ and UP⁰, dNTP⁰, ddNTP²,
respectively. In a variation of this process, an amplification of the mass increment in
mass-modifying the extended DNA fragments can be achieved by either using an equally
mass-modified deoxynucleoside triphosphate (i.e., dNTP¹, dNTP²) for chain elongation
alone or in conjunction with the homologous equally mass-modified dideoxynucleoside
triphosphate. For the three clones depicted above, the contents of the reaction mixtures
can be as follows: either UP⁰/dNTP⁰/ddNTP⁰, UP⁰/dNTP¹/ddNTP⁰ and
UP⁰/dNTP²/ddNTP⁰ or UP⁰/dNTP⁰/ddNTP⁰, UP⁰/dNTP¹/ddNTP¹ and
UP⁰/dNTP²/ddNTP². As described above, DNA sequencing can be performed by
Sanger RNA sequencing employing unmodified nucleic acid primers, UP⁰, and an
appropriate mixture of chain-elongating and terminating nucleoside triphosphates. The
mass-modification can be again either in the chain-terminating nucleoside triphosphate
alone or in conjunction with mass-modified chain-elongating nucleoside triphosphates.
Multiplexing is achieved by pooling the three base-specifically terminated sequencing
reactions (e.g., the ddTTP terminated products) and simultaneously analyzing the pooled
products by mass spectrometry. Again, the first extension products of the known nucleic
acid primer sequence are assigned, e.g., via a computer program. Mass/sequence
assignments are possible even in the worst case in which the nucleic acid primer is
extended/terminated by the same nucleotide, e.g., ddT, in all three clones. The following
configurations thus obtained can be well differentiated by their different
mass-modifications: UP⁰-ddT⁰, UP⁰-ddT¹, UP⁰-ddT².

In yet another embodiment of this invention, DNA sequencing by multiplex 
mass spectrometry can be achieved by cloning the DNA fragments to be sequenced in
"plex-vectors" containing vector specific "tag sequences" as described (Köster et al.,
"Oligonucleotide Synthesis and Multiplex DNA Sequencing Using Chemiluminescent
Detection," Nucleic Acids Res. Symposium Ser. No. 24, 318-321 (1991)); then pooling
clones from different plex-vectors for DNA preparation and the four separate Sanger 
sequencing reactions using standard dNTP°/ddNTP° and nucleic acid primer UP°; 
purifying the four multiplex fragment families via linking to a solid support through the 
linking group, L, at the 5'-end of UP; washing out all by-products, and cleaving the 
purified multiplex DNA fragments off the support or using the L-L' bound nested Sanger
fragments as such for mass spectrometry analysis as described above; performing de- 
multiplexing by one-by-one hybridization of specific "tag probes"; and subsequently 
analyzing by mass spectrometry (see, for example, FIGURE 13). As a reference point, 
the four base-specifically terminated multiplex DNA fragment families are run by the 
mass spectrometer and all ddT⁰-, ddA⁰-, ddC⁰- and ddG⁰-terminated molecular ion
peaks are respectively detected and memorized. Assignment of, for example,
ddT⁰-terminated DNA fragments to a specific fragment family is accomplished by another mass
spectrometry analysis after hybridization of the specific tag probe (TP) to the 
corresponding tag sequence contained in the sequence of this specific fragment family. 
Only those molecular ion peaks which are capable of hybridizing to the specific tag probe 
are shifted to a higher molecular mass by the same known mass increment (e.g. of the tag 
probe). These shifted ion peaks, by virtue of all hybridizing to a specific tag probe, 
belong to the same fragment family. For a given fragment family, this is repeated for the 
remaining chain terminated fragment families with the same tag probe to assign the 
complete DNA sequence. This process is repeated i-1 times corresponding to i clones
multiplexed (the i-th clone is identified by default).
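
The de-multiplexing step described above amounts to comparing two peak lists: the reference run and the run after hybridization of one tag probe. A minimal Python sketch is given below, assuming that a hybridized fragment appears shifted by exactly the mass of the tag probe. The function name, tolerance and all masses are hypothetical.

```python
def shifted_peaks(reference, after_hybridization, tag_probe_mass, tolerance=1.0):
    """Reference peaks that reappear shifted upward by the tag probe mass."""
    family = []
    for peak in reference:
        target = peak + tag_probe_mass
        if any(abs(p - target) <= tolerance for p in after_hybridization):
            family.append(peak)
    return family


# Hypothetical reference run of pooled ddT-terminated fragments (Da).
reference = [6000.0, 6100.0, 6304.2, 6500.0]
# Hypothetical run after hybridizing a 6150.0 Da tag probe: two peaks shift.
hybridized = [6100.0, 6500.0, 12150.0, 12454.2]

print(shifted_peaks(reference, hybridized, tag_probe_mass=6150.0))
```

Here the peaks at 6000.0 and 6304.2 are the ones that shift, so they would be assigned to the fragment family carrying the matching tag sequence; repeating the comparison with the next tag probe assigns the next family.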

The differentiation of the tag probes for the different multiplexed clones can 
be obtained just by the DNA sequence and its ability to Watson-Crick base pair to the tag
sequence. It is well known in the art how to calculate stringency conditions to provide
for specific hybridization of a given tag probe with a given tag sequence (see, for
example, Molecular Cloning: A Laboratory Manual, 2nd ed., ed. by Sambrook, Fritsch and
Maniatis (Cold Spring Harbor Laboratory Press: NY, 1989), Chapter 11). Furthermore,
differentiation can be obtained by designing the tag sequence for each plex-vector to have
a sufficient mass difference so as to be unique just by changing the length or base
composition or by mass-modifications according to FIGURES 7, 9 and 10. In order to
keep the duplex between the tag sequence and the tag probe intact during mass
spectrometry analysis, it is another embodiment of the invention to provide for a covalent
attachment mediated by, for example, photoreactive groups such as psoralen and
ellipticine and by other methods known to those skilled in the art (see, for example,
Hélène et al., Nature 344, 358 (1990) and Thuong et al., "Oligonucleotides Attached to
Intercalators, Photoreactive and Cleavage Agents" in F. Eckstein, Oligonucleotides and
Analogues: A Practical Approach, IRL Press, Oxford 1991, 283-306).

The DNA sequence is unraveled again by searching for the lowest
molecular weight molecular ion peak corresponding to the known UP⁰-tag sequence/tag
probe molecular weight plus the first extension product, e.g., ddT⁰, then the second, the
third, etc.

In a combination of the latter approach with the previously described 
multiplexing processes, a further increase in multiplexing can be achieved by using, in
addition to the tag probe/tag sequence interaction, mass-modified nucleic acid primers
(FIGURE 7) and/or mass-modified deoxynucleoside, dNTP⁰⁻ⁱ, and/or dideoxynucleoside
triphosphates, ddNTP⁰⁻ⁱ. Those skilled in the art will realize that the tag sequence/tag
probe multiplexing approach is not limited to Sanger DNA sequencing generating nested
DNA fragments with DNA polymerases. The DNA sequence can also be determined by
transcribing the unknown DNA sequence from appropriate promoter-containing vectors
(see above) with various RNA polymerases and mixtures of NTP⁰⁻ⁱ/3'-dNTP⁰⁻ⁱ, thus
generating nested RNA fragments.

In yet another embodiment of this invention, the mass-modifying 
functionality can be introduced by a two or multiple step process. In this case, the
nucleic acid primer, the chain-elongating or terminating nucleoside triphosphates and/or
the tag probes are, in a first step, modified by a precursor functionality such as azido,
-N₃, or modified with a functional group in which the R in XR is H (FIGURES 7 and 9), thus
providing temporary functions, e.g., but not limited to -OH, -NH₂, -NHR, -SH, -NCS,
-OCO(CH₂)r-COOH (r = 1-20), -NHCO(CH₂)r-COOH (r = 1-20), -OSO₂OH,
-OCO(CH₂)r-I (r = 1-20), -OP(O-Alkyl)N(Alkyl)₂. These less bulky functionalities result
in better substrate properties for the enzymatic DNA or RNA synthesis reactions of the
DNA sequencing process. The appropriate mass-modifying functionality is then
introduced after the generation of the nested base-specifically terminated DNA or RNA
fragments prior to mass spectrometry. Several examples of compounds which can serve
as mass-modifying functionalities are depicted in FIGURES 9 and 10 without limiting the
scope of this invention.

Another aspect of this invention concerns kits for sequencing nucleic acids 
by mass spectrometry which include combinations of the above-described sequencing 
reactants. For instance, in one embodiment, the kit comprises reactants for multiplex 
mass spectrometric sequencing of several different species of nucleic acid. The kit can 
include a solid support having a linking functionality (L') for immobilization of the
base-specifically terminated products; at least one nucleic acid primer having a linking group
(L) for reversibly and temporarily linking the primer and solid support through, for
example, a photocleavable bond; a set of chain-elongating nucleotides (e.g., dATP,
dCTP, dGTP and dTTP, or ATP, CTP, GTP and UTP); a set of chain-terminating
nucleotides (such as 2',3'-dideoxynucleotides for DNA synthesis or 3'-deoxynucleotides
for RNA synthesis); and an appropriate polymerase for synthesizing complementary
nucleotides. Primers and/or terminating nucleotides can be mass-modified so that the
base-specifically terminated fragments generated from one of the species of nucleic acids
to be sequenced can be distinguished by mass spectrometry from all of the others.
Alternative to the use of mass-modified synthesis reactants, a set of tag probes (as 
described above) can be included in the kit. The kit can also include appropriate buffers 
as well as instructions for performing multiplex mass spectrometry to concurrently 
sequence multiple species of nucleic acids. 

In another embodiment, a nucleic acid sequencing kit can comprise a solid 
support as described above, a primer for initiating synthesis of complementary nucleic
acid fragments, a set of chain-elongating nucleotides and an appropriate polymerase. The
mass-modified chain-terminating nucleotides are selected so that the addition of one of 
the chain terminators to a growing complementary nucleic acid can be distinguished by 
mass spectrometry. 

The present invention is further illustrated by the following examples which 
should not be construed as limiting in any way. The contents of all cited references 
(including literature references, issued patents, published patent applications (including
international patent application Publication Number WO 94/16101, entitled "DNA 
Sequencing by Mass Spectrometry" by H. Koester, and international patent application 
Publication Number WO 94/21822 entitled "DNA Sequencing by Mass Spectrometry Via 
Exonuclease Degradation" by H. Koester), and co-pending patent applications, (including 
U.S. Patent Application Serial No. 08/406,199, entitled "DNA Diagnostics Based on
Mass Spectrometry" by H. Koester), as cited throughout this application are hereby 
expressly incorporated by reference. 

EXAMPLE 1 

Immobilization of primer-extension products of Sanger DNA sequencing reaction 
for mass spectrometric analysis via disulfide bonds. 

As a solid support, Sequelon membranes (Millipore Corp., Bedford, MA)
with phenyl isothiocyanate groups are used as a starting material. The membrane disks,
with a diameter of 8 mm, are wetted with a solution of N-methylmorpholine/water/
2-propanol (NMM solution) (2/49/49 v/v/v), the excess liquid is removed with filter paper
and the disks are placed on a piece of plastic film or aluminum foil located on a heating
block set to 55°C. A solution of 1 mM 2-mercaptoethylamine (cysteamine) or
2,2'-dithio-bis(ethylamine) (cystamine) or S-(2-thiopyridyl)-2-thio-ethylamine (10 ul, 10 nmol) in
NMM is added per disk and heated at 55°C. After 15 min, 10 ul of NMM solution are
added per disk and heated for another 5 min. Excess of isothiocyanate groups may be
removed by treatment with 10 ul of a 10 mM solution of glycine in NMM solution. For
cystamine, the disks are treated with 10 ul of a solution of 1M aqueous dithiothreitol
(DTT)/2-propanol (1:1 v/v) for 15 min at room temperature. Then, the disks are
thoroughly washed in a filtration manifold with 5 aliquots of 1 ml each of the NMM
solution, then with 5 aliquots of 1 ml acetonitrile/water (v/v) and subsequently dried.
If not used immediately, the disks are stored with free thiol groups in a solution of 1M
aqueous dithiothreitol/2-propanol (1:1 v/v) and, before use, DTT is removed by three
washings with 1 ml each of the NMM solution. The primer oligonucleotides with 5'-SH
functionality can be prepared by various methods (e.g., B.C.F. Chu et al., Nucleic
Acids Res. 14, 5591-5603 (1986), Sproat et al., Nucleic Acids Res. 15, 4837-48 (1987)
and Oligonucleotides and Analogues: A Practical Approach (F. Eckstein, editor), IRL
Press Oxford, 1991). Sequencing reactions according to the Sanger protocol are
performed in a standard way (e.g., H. Swerdlow et al., Nucleic Acids Res. 18, 1415-19
(1990)). In the presence of about 7-10 mM DTT the free 5'-thiol primer can be used; in
other cases, the SH functionality can be protected, e.g., by a trityl group during the
Sanger sequencing reactions and removed prior to anchoring to the support in the
following way. The four sequencing reactions (150 ul each in an Eppendorf tube) are
terminated by a 10 min incubation at 70°C to denature the DNA polymerase (such as
Klenow fragment, Sequenase) and the reaction mixtures are ethanol precipitated. The
supernatants are removed and the pellets vortexed with 25 ul of a 1M aqueous silver
nitrate solution, and after one hour at room temperature, 50 ul of a 1M aqueous
solution of DTT is added and mixed by vortexing. After 15 min, the mixtures are
centrifuged and the pellets are washed twice with 100 ul ethylacetate by vortexing and
centrifugation to remove excess DTT. The primer extension products with free 5'-thiol
group are now coupled to the thiolated membrane supports under mild oxidizing
conditions. In general, it is sufficient to add the 5'-thiolated primer extension products
dissolved in 10 ul 10 mM de-aerated triethylammonium acetate buffer (TEAA) pH 7.2 to
the thiolated membrane supports. Coupling is achieved by drying the samples onto the
membrane disks with a cold fan. This process can be repeated by wetting the membrane
with 10 ul of 10 mM TEAA buffer pH 7.2 and drying as before. When using the
2-thiopyridyl derivatized compounds, anchoring can be monitored by the release of
pyridine-2-thione spectrophotometrically at 343 nm.
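
The reagent amounts quoted in this example follow from concentration times volume: a 1 mM solution contains 1 nmol per ul, so 10 ul of a 1 mM solution delivers 10 nmol, as stated above. The short Python sketch below applies the same arithmetic to other quantities in the protocol; the function name is hypothetical and the calculation ignores any dilution by co-solvents.

```python
def amount_nmol(concentration_mM, volume_ul):
    """Amount of solute in nmol; 1 mM corresponds to 1 nmol per ul."""
    return concentration_mM * volume_ul


print(amount_nmol(1, 10))     # thiol reagent per disk: 10 nmol
print(amount_nmol(10, 10))    # glycine quench: 100 nmol
print(amount_nmol(1000, 25))  # 25 ul of 1M silver nitrate: 25000 nmol
```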

In another variation of this approach, the oligonucleotide primer is
functionalized with an amino group at the 5'-end which is introduced by standard
procedures during automated DNA synthesis. After primer extension, during the Sanger
sequencing process, the primary amino group is reacted with 3-(2-pyridyldithio)propionic
acid N-hydroxysuccinimide ester (SPDP) and subsequently coupled to the
thiolated supports and monitored by the release of pyridine-2-thione as described above.
After denaturation of DNA polymerase and ethanol precipitation of the sequencing
products, the supernatants are removed and the pellets dissolved in 10 ul 10 mM TEAA
buffer pH 7.2 and 10 ul of a 2 mM solution of SPDP in 10 mM TEAA are added. The
reaction mixture is vortexed and incubated for 30 min at 25°C. Excess SPDP is then
removed by three extractions (vortexing, centrifugation) with 50 ul each of ethanol and
the resulting pellets are dissolved in 10 ul 10 mM TEAA buffer pH 7.2 and coupled to
the thiolated supports (see above).

The primer-extension products are purified by washing the membrane disks 
three times each with 100 ul NMM solution and three times with 100 ul each of 10 mM
TEAA buffer pH 7.2. The purified primer-extension products are released by three
successive treatments with 10 ul of 10 mM 2-mercaptoethanol in 10 mM TEAA buffer
pH 7.2, lyophilized and analyzed by either ES or MALDI mass spectrometry.

This procedure can also be used for the mass-modified nucleic acid primers
UP⁰⁻ⁱ in an analogous and appropriate way, taking into account the chemical properties
of the mass-modifying functionalities.

EXAMPLE 2 

Immobilization of primer-extension products of Sanger DNA sequencing reaction 
for mass spectrometric analysis via the levulinyl group 

5-Aminolevulinic acid is protected at the primary amino group with the
Fmoc group using 9-fluorenylmethyl N-succinimidyl carbonate and is then transformed
into the N-hydroxysuccinimide ester (NHS ester) using N-hydroxysuccinimide and
dicyclohexyl carbodiimide under standard conditions. For the Sanger sequencing
reactions, nucleic acid primers, UP⁰⁻ⁱ, are used which are functionalized with a primary
amino group at the 5'-end introduced by standard procedures during automated DNA
synthesis with aminolinker phosphoamidites as the final synthetic step. Sanger
sequencing is performed under standard conditions (see above). The four reaction
mixtures (150 ul each in an Eppendorf tube) are heated to 70°C for 10 min to inactivate
the DNA polymerase, ethanol precipitated, centrifuged and resuspended in 10 ul of 10
mM TEAA buffer pH 7.2. 10 ul of a 2 mM solution of the Fmoc-5-aminolevulinyl-NHS
ester in 10 mM TEAA buffer is added, vortexed and incubated at 25°C for 30 min. The
excess of the reagent is removed by ethanol precipitation and centrifugation. The Fmoc
group is cleaved off by resuspending the pellets in 10 ul of a solution of 20% piperidine in
N,N-dimethylformamide/water (1:1 v/v). After 15 min at 25°C, piperidine is thoroughly
removed by three precipitations/centrifugations with 100 ul each of ethanol, the pellets 
are resuspended in 10 ul of a solution of N-methylmorpholine, 2-propanol and water 
(2/10/88 v/v/v) and are coupled to the solid support carrying an isothiocyanate group. In 
the case of the DITC-Sequelon membrane (Millipore Corp., Bedford, MA), the 
membranes are prepared as described in EXAMPLE 1 and coupling is achieved on a 
heating block at 55°C as described above. RNA extension products are immobilized in 
an analogous way. The procedure can be applied to other solid supports with 
isothiocyanate groups in a similar manner. 

The immobilized primer-extension products are extensively washed three 
times with 100 ul each of NMM solution and three times with 100 ul 10 mM TEAA 
buffer pH 7.2. The purified primer-extension products are released by three successive 
treatments with 10 ul of 100 mM hydrazinium acetate buffer pH 6.5, lyophilized and 
analyzed by either ES or MALDI mass spectrometry. 

EXAMPLE 3 

Immobilization of primer-extension products of Sanger DNA sequencing reaction 
for mass spectrometric analysis via a trypsin sensitive linkage 

Sequelon DITC membrane disks of 8 mm diameter (Millipore Corp., 
Bedford, MA) are wetted with 10 ul of NMM solution (N-methylmorpholine/2-propanol/
water; 2/49/49 v/v/v) and a linker arm introduced by reaction with 10 ul of a 10 mM
solution of 1,6-diaminohexane in NMM. The excess diamine is removed by three
washing steps with 100 ul of NMM solution. Using standard peptide synthesis protocols,
two L-lysine residues are attached by two successive condensations with N-Fmoc-N-
tBoc-L-lysine pentafluorophenylester, the terminal Fmoc group is removed with
piperidine in NMM and the free α-amino group coupled to 1,4-phenylene
diisothiocyanate (DITC). Excess DITC is removed by three washing steps with 100 ul
2-propanol and the N-tBoc groups removed with trifluoroacetic acid according to standard
peptide synthesis procedures. The nucleic acid primer-extension products are prepared
from oligonucleotides which carry a primary amino group at the 5'-terminus. The four
Sanger DNA sequencing reaction mixtures (150 ul each in Eppendorf tubes) are heated 
for 10 min at 70°C to inactivate the DNA polymerase, ethanol precipitated, and the 
pellets resuspended in 10 ul of a solution of N-methylmorpholine, 2-propanol and water 
(2/10/88 v/v/v). This solution is transferred to the Lys-Lys-DITC membrane disks and 
coupled on a heating block set at 55°C. After drying, 10 ul of NMM solution is added 
and the drying process repeated. 

The immobilized primer-extension products are extensively washed three 
times with 100 ul each of NMM solution and three times with 100 ul each of 10 mM 
TEAA buffer pH 7.2. For mass spectrometric analysis, the bond between the primer- 
extension products and the solid support is cleaved by treatment with trypsin under 
standard conditions and the released products analyzed by either ES or MALDI mass 
spectrometry with trypsin serving as an internal mass standard. 

EXAMPLE 4 

Immobilization of primer-extension products of Sanger DNA sequencing reaction 
for mass spectrometric analysis via pyrophosphate linkage 

The DITC Sequelon membrane disks (8 mm diameter) are prepared as
described in EXAMPLE 3 and 10 µl of a 10 mM solution of 3-aminopyridine adenine
dinucleotide (APAD) (Sigma) in NMM solution is added. The excess APAD is removed by
a 10 µl wash of NMM solution and the disks are treated with 10 µl of 10 mM sodium
periodate in NMM solution (15 min, 25°C). Excess periodate is removed and the
primer-extension products of the four Sanger DNA sequencing reactions (150 µl each in
Eppendorf tubes) employing nucleic acid primers with a primary amino group at the
5'-end are ethanol precipitated, dissolved in 10 µl of a solution of
N-methylmorpholine/2-propanol/water (2/10/88 v/v/v) and coupled to the
2',3'-dialdehydo groups of the immobilized NAD analog.

The primer-extension products are extensively washed with the NMM
solution (3 times with 100 µl each) and 10 mM TEAA buffer pH 7.2 (3 times with 100 µl
each) and the purified primer-extension products are released by treatment with either
NADase or pyrophosphatase in 10 mM TEAA buffer at pH 7.2 at 37°C for 15 min,
lyophilized and analyzed by either ES or MALDI mass spectrometry, the enzymes serving
as internal mass standards.

EXAMPLE 5

Synthesis of nucleic acid primers mass-modified by glycine residues at the
5'-position of the sugar moiety of the terminal nucleoside

Oligonucleotides are synthesized by standard automated DNA synthesis
using β-cyanoethylphosphoamidites (H. Köster et al., Nucleic Acids Res. 12, 4539
(1984)) and a 5'-amino group is introduced at the end of solid phase DNA synthesis (e.g.
Agrawal et al., Nucleic Acids Res. 14, 6227-45 (1986) or Sproat et al., Nucleic Acids
Res. 15, 6181-96 (1987)). The total amount of an oligonucleotide synthesis, starting
with 0.25 µmol CPG-bound nucleoside, is deprotected with concentrated aqueous
ammonia, purified via OligoPAK(TM) Cartridges (Millipore Corp., Bedford, MA) and
lyophilized. This material with a 5'-terminal amino group is dissolved in 100 µl absolute
N,N-dimethylformamide (DMF) and condensed with 10 µmole N-Fmoc-glycine
pentafluorophenyl ester for 60 min at 25°C. After ethanol precipitation and
centrifugation, the Fmoc group is cleaved off by a 10 min treatment with 100 µl of a
solution of 20% piperidine in N,N-dimethylformamide. Excess piperidine, DMF and the
cleavage product from the Fmoc group are removed by ethanol precipitation and the
precipitate lyophilized from 10 mM TEAA buffer pH 7.2. This material is now either
used as primer for the Sanger DNA sequencing reactions or one or more glycine residues
(or other suitable protected amino acid active esters) are added to create a series of
mass-modified primer oligonucleotides suitable for Sanger DNA or RNA sequencing.
Immobilization of these mass-modified nucleic acid primers UP after primer-extension
during the sequencing process can be achieved as described, e.g., in EXAMPLES 1 to 4.



EXAMPLE 6

Synthesis of nucleic acid primers mass-modified at C-5 of the heterocyclic base of a
pyrimidine nucleoside with glycine residues

Starting material was 5-(3-aminopropynyl-1)-3',5'-di-p-tolyldeoxyuridine
prepared and 3',5'-de-O-acylated according to literature procedures (Haralambidis et al.,
Nucleic Acids Res. 15, 4857-76 (1987)). 0.281 g (1.0 mmole) 5-(3-aminopropynyl-1)-2'-
deoxyuridine was reacted with 0.927 g (2.0 mmole) N-Fmoc-glycine
pentafluorophenylester in 5 ml absolute N,N-dimethylformamide in the presence of 0.129
g (1 mmole; 174 µl) N,N-diisopropylethylamine for 60 min at room temperature.
Solvents were removed by rotary evaporation and the product was purified by silica gel
chromatography (Kieselgel 60, Merck; column: 2.5 x 50 cm, elution with
chloroform/methanol mixtures). Yield was 0.44 g (0.78 mmole, 78 %). In order to add
another glycine residue, the Fmoc group is removed with a 20 min treatment with a 20%
solution of piperidine in DMF, evaporated in vacuo and the remaining solid material
extracted three times with 20 ml ethylacetate. After having removed the remaining
ethylacetate, N-Fmoc-glycine pentafluorophenylester is coupled as described above.
5-(3-(N-Fmoc-glycyl)-amidopropynyl-1)-2'-deoxyuridine is transformed into the
5'-O-dimethoxytritylated nucleoside-3'-O-β-cyanoethyl-N,N-diisopropylphosphoamidite and
incorporated into automated oligonucleotide synthesis by standard procedures (H. Köster
et al., Nucleic Acids Res. 12, 2261 (1984)). This glycine modified thymidine analogue
building block for chemical DNA synthesis can be used to substitute one or more of the
thymidine/uridine nucleotides in the nucleic acid primer sequence. The Fmoc group is
removed at the end of the solid phase synthesis with a 20 min treatment with a 20 %
solution of piperidine in DMF at room temperature. DMF is removed by a washing step
with acetonitrile and the oligonucleotide deprotected and purified in the standard way.

EXAMPLE 7

Synthesis of a nucleic acid primer mass-modified at C-5 of the heterocyclic base of
a pyrimidine nucleoside with β-alanine residues

Starting material was the same as in EXAMPLE 6. 0.281 g (1.0 mmole)
5-(3-aminopropynyl-1)-2'-deoxyuridine was reacted with N-Fmoc-β-alanine
pentafluorophenylester (0.955 g, 2.0 mmole) in 5 ml N,N-dimethylformamide (DMF) in
the presence of 0.129 g (174 µl; 1.0 mmole) N,N-diisopropylethylamine for 60 min at
room temperature. Solvents were removed and the product purified by silica gel
chromatography as described in EXAMPLE 6. Yield was 0.425 g (0.74 mmole, 74 %).
Another β-alanine moiety can be added in exactly the same way after removal of the
Fmoc group. The preparation of the 5'-O-dimethoxytritylated
nucleoside-3'-O-β-cyanoethyl-N,N-diisopropylphosphoamidite from
5-(3-(N-Fmoc-β-alanyl)-amidopropynyl-1)-2'-deoxyuridine and incorporation into
automated oligonucleotide synthesis is performed under standard conditions. This
building block can substitute for any of the thymidine/uridine residues in the nucleic
acid primer sequence. In the case of only one incorporated mass-modified nucleotide,
the nucleic acid primer molecules prepared according to EXAMPLES 6 and 7 would have a
mass difference of 14 daltons.

EXAMPLE 8

Synthesis of a nucleic acid primer mass-modified at C-5 of the heterocyclic base of 
a pyrimidine nucleoside with ethylene glycol monomethyl ether 

As a nucleosidic component, 5-(3-aminopropynyl-1)-2'-deoxyuridine was
used in this example (see EXAMPLES 6 and 7). The mass-modifying functionality was
obtained as follows: 7.61 g (100.0 mmole) freshly distilled ethylene glycol monomethyl
ether dissolved in 50 ml absolute pyridine was reacted with 10.01 g (100.0 mmole)
recrystallized succinic anhydride in the presence of 1.22 g (10.0 mmole)
4-N,N-dimethylaminopyridine overnight at room temperature. The reaction was terminated
by the addition of water (5.0 ml), the reaction mixture evaporated in vacuo, co-evaporated
twice with dry toluene (20 ml each) and the residue redissolved in 100 ml
dichloromethane. The solution was extracted successively, twice with 10 % aqueous
citric acid (2 x 20 ml) and once with water (20 ml) and the organic phase dried over
anhydrous sodium sulfate. The organic phase was evaporated in vacuo, the residue
redissolved in 50 ml dichloromethane and precipitated into 500 ml pentane and the
precipitate dried in vacuo. Yield was 13.12 g (74.0 mmole; 74 %). 8.86 g (50.0 mmole)
of succinylated ethylene glycol monomethyl ether was dissolved in 100 ml dioxane
containing 5% dry pyridine (5 ml) and 6.96 g (50.0 mmole) 4-nitrophenol and 10.32 g
(50.0 mmole) dicyclohexylcarbodiimide were added and the reaction run at room
temperature for 4 hours. Dicyclohexylurea was removed by filtration, the filtrate
evaporated in vacuo and the residue redissolved in 50 ml anhydrous DMF. 12.5 ml
(about 12.5 mmole 4-nitrophenylester) of this solution was used to dissolve 2.81 g (10.0
mmole) 5-(3-aminopropynyl-1)-2'-deoxyuridine. The reaction was performed in the
presence of 1.01 g (10.0 mmole; 1.4 ml) triethylamine at room temperature overnight.
The reaction mixture was evaporated in vacuo, co-evaporated with toluene, redissolved
in dichloromethane and chromatographed on silica gel (Si60, Merck; column 4 x 50 cm)
with dichloromethane/methanol mixtures. The fractions containing the desired compound
were collected, evaporated, redissolved in 25 ml dichloromethane and precipitated into
250 ml pentane. The dried precipitate of 5-(3-N-(O-succinyl ethylene glycol monomethyl
ether)-amidopropynyl-1)-2'-deoxyuridine (yield: 65 %) is 5'-O-dimethoxytritylated and
transformed into the nucleoside-3'-O-β-cyanoethyl-N,N-diisopropylphosphoamidite and
incorporated as a building block in the automated oligonucleotide synthesis according to
standard procedures. The mass-modified nucleotide can substitute for one or more of the
thymidine/uridine residues in the nucleic acid primer sequence. Deprotection and
purification of the primer oligonucleotide also follow standard procedures.

EXAMPLE 9

Synthesis of a nucleic acid primer mass-modified at C-5 of the heterocyclic base of
a pyrimidine nucleoside with diethylene glycol monomethyl ether

Nucleosidic starting material was as in previous examples,
5-(3-aminopropynyl-1)-2'-deoxyuridine. The mass-modifying functionality was obtained
similar to EXAMPLE 8. 12.02 g (100.0 mmole) freshly distilled diethylene glycol
monomethyl ether dissolved in 50 ml absolute pyridine was reacted with 10.01 g (100.0
mmole) recrystallized succinic anhydride in the presence of 1.22 g (10.0 mmole)
4-N,N-dimethylaminopyridine (DMAP) overnight at room temperature. The work-up was as
described in EXAMPLE 8. Yield was 18.35 g (82.3 mmole, 82.3 %). 11.06 g (50.0
mmole) of succinylated diethylene glycol monomethyl ether was transformed into the
4-nitrophenylester and, subsequently, 12.5 mmole was reacted with 2.81 g (10.0 mmole) of
5-(3-aminopropynyl-1)-2'-deoxyuridine as described in EXAMPLE 8. Yield after silica
gel column chromatography and precipitation into pentane was 3.34 g (6.9 mmole, 69
%). After dimethoxytritylation and transformation into the
nucleoside-β-cyanoethylphosphoamidite, the mass-modified building block is incorporated
into automated chemical DNA synthesis according to standard procedures. Within the
sequence of the nucleic acid primer UP0-i, one or more of the thymidine/uridine residues
can be substituted by this mass-modified nucleotide. In the case of only one incorporated
mass-modified nucleotide, the nucleic acid primers of EXAMPLES 8 and 9 would have a
mass difference of 44.05 daltons.
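
The 44.05 dalton figure corresponds to one additional oxyethylene unit (C2H4O) in the
diethylene glycol derivative. A minimal sketch of that arithmetic follows, assuming
standard average atomic masses; the function and variable names are illustrative only
and are not part of the described procedure.

```python
# Average atomic masses in daltons (standard rounded values, assumed here).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_mass(composition):
    """Return the average mass (Da) of a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# One oxyethylene unit, -CH2-CH2-O-, is what distinguishes the diethylene
# glycol monomethyl ether modification from the ethylene glycol one.
oxyethylene = {"C": 2, "H": 4, "O": 1}
delta = formula_mass(oxyethylene)

print(f"mass difference per oxyethylene unit: {delta:.2f} Da")  # 44.05 Da
```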

EXAMPLE 10 

Synthesis of a nucleic acid primer mass-modified at C-8 of the heterocyclic base of 
deoxyadenosine with glycine 

Starting material was N6-benzoyl-8-bromo-5'-O-(4,4'-dimethoxytrityl)-2'-
deoxyadenosine prepared according to literature (Singh et al., Nucleic Acids Res. 18,
3339-45 (1990)). 632.5 mg (1.0 mmole) of this 8-bromo-deoxyadenosine derivative was
suspended in 5 ml absolute ethanol and reacted with 251.2 mg (2.0 mmole) glycine
methyl ester (hydrochloride) in the presence of 241.4 mg (2.1 mmole; 366 µl)
N,N-diisopropylethylamine and refluxed until the starting nucleosidic material had
disappeared (4-6 hours) as checked by thin layer chromatography (TLC). The solvent was
evaporated and the residue purified by silica gel chromatography (column 2.5 x 50 cm)
using solvent mixtures of chloroform/methanol containing 0.1 % pyridine. The product
fractions were combined, the solvent evaporated, the fractions dissolved in 5 ml
dichloromethane and precipitated into 100 ml pentane. Yield was 487 mg (0.76 mmole,
76 %). Transformation into the corresponding nucleoside-β-cyanoethylphosphoamidite
and integration into automated chemical DNA synthesis is performed under standard
conditions. During final deprotection with aqueous concentrated ammonia, the methyl
group is removed from the glycine moiety. The mass-modified building block can
substitute one or more deoxyadenosine/adenosine residues in the nucleic acid primer
sequence.

EXAMPLE 11

Synthesis of a nucleic acid primer mass-modified at C-8 of the heterocyclic base of
deoxyadenosine with glycylglycine

This derivative was prepared in analogy to the glycine derivative of
EXAMPLE 10. 632.5 mg (1.0 mmole) N6-benzoyl-8-bromo-5'-O-(4,4'-
dimethoxytrityl)-2'-deoxyadenosine was suspended in 5 ml absolute ethanol and reacted
with 324.3 mg (2.0 mmole) glycyl-glycine methyl ester in the presence of 241.4 mg (2.1
mmole, 366 µl) N,N-diisopropylethylamine. The mixture was refluxed and completeness
of the reaction checked by TLC. Work-up and purification was similar to that described
in EXAMPLE 10. Yield after silica gel column chromatography and precipitation into
pentane was 464 mg (0.65 mmole, 65 %). Transformation into the
nucleoside-β-cyanoethylphosphoamidite and into synthetic oligonucleotides is done
according to standard procedures. In the case where only one of the
deoxyadenosine/adenosine residues in the nucleic acid primer is substituted by this
mass-modified nucleotide, the mass difference between the nucleic acid primers of
EXAMPLES 10 and 11 is 57.03 daltons.

EXAMPLE 12

Synthesis of a nucleic acid primer mass-modified at the C-2' of the sugar moiety of
2'-amino-2'-deoxythymidine with ethylene glycol monomethyl ether residues

Starting material was 5'-O-(4,4-dimethoxytrityl)-2'-amino-2'-
deoxythymidine synthesized according to published procedures (e.g., Verheyden et al.,
J. Org. Chem. 36, 250-254 (1971); Sasaki et al., J. Org. Chem. 41, 3138-3143 (1976);
Imazawa et al., J. Org. Chem. 44, 2039-2041 (1979); Hobbs et al., J. Org. Chem. 42,
714-719 (1976); Ikehara et al., Chem. Pharm. Bull. Japan 26, 240-244 (1978); see also
PCT Application WO 88/00201). 5'-O-(4,4-Dimethoxytrityl)-2'-amino-2'-
deoxythymidine (559.62 mg; 1.0 mmole) was reacted with 2.0 mmole of the
4-nitrophenyl ester of succinylated ethylene glycol monomethyl ether (see EXAMPLE 8) in
10 ml dry DMF in the presence of 1.0 mmole (140 µl) triethylamine for 18 hours at room
temperature. The reaction mixture was evaporated in vacuo, co-evaporated with
toluene, redissolved in dichloromethane and purified by silica gel chromatography (Si60,
Merck; column: 2.5 x 50 cm; eluent: chloroform/methanol mixtures containing 0.1 %
triethylamine). The product containing fractions were combined, evaporated and
precipitated into pentane. Yield was 524 mg (0.73 mmol; 73 %). Transformation into
the nucleoside-β-cyanoethyl-N,N-diisopropylphosphoamidite and incorporation into the
automated chemical DNA synthesis protocol is performed by standard procedures. The
mass-modified deoxythymidine derivative can substitute for one or more of the thymidine
residues in the nucleic acid primer.

In an analogous way, by employing the 4-nitrophenyl ester of succinylated
diethylene glycol monomethyl ether (see EXAMPLE 9) and triethylene glycol
monomethyl ether, the corresponding mass-modified oligonucleotides are prepared. In
the case of only one incorporated mass-modified nucleoside within the sequence, the
mass difference between the ethylene, diethylene and triethylene glycol derivatives is
44.05, 88.1 and 132.15 daltons respectively.

EXAMPLE 13

Synthesis of a nucleic acid primer mass-modified in the internucleotidic linkage via 
alkylation of phosphorothioate groups 

Phosphorothioate-containing oligonucleotides were prepared according to
standard procedures (see e.g. Gait et al., Nucleic Acids Res. 19, 1183 (1991)). One,
several or all internucleotide linkages can be modified in this way. The (-)-M13 nucleic
acid primer sequence (17-mer) 5'-dGTAAAACGACGGCCAGT was synthesized in 0.25
µmole scale on a DNA synthesizer and one phosphorothioate group introduced after the
final synthesis cycle (G to T coupling). Sulfurization, deprotection and purification
followed standard protocols. Yield was 31.4 nmole (12.6 % overall yield),
corresponding to 31.4 nmole phosphorothioate groups. Alkylation was performed by
dissolving the residue in 31.4 µl TE buffer (0.01 M Tris pH 8.0, 0.001 M EDTA) and by
adding 16 µl of a 20 mM solution of 2-iodoethanol (320 nmole; i.e., 10-fold
excess with respect to phosphorothioate diesters) in N,N-dimethylformamide (DMF).
The alkylated oligonucleotide was purified by standard reversed phase HPLC (RP-18
Ultrasphere, Beckman; column: 4.5 x 250 mm; 100 mM triethylammonium acetate, pH 7.0
and a gradient of 5 to 40 % acetonitrile).
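
The quantities quoted for the alkylation step can be cross-checked with simple
arithmetic. The sketch below assumes the synthesis scale is 0.25 µmole (250 nmole) and
uses illustrative helper names that do not come from the original text.

```python
def nmol_from_solution(volume_ul, concentration_mm):
    """Amount (nmol) contained in a volume (µl) of a solution of given molarity (mM)."""
    # 1 mM equals 1 nmol per µl, so µl times mM gives nmol directly.
    return volume_ul * concentration_mm


synthesis_scale_nmol = 250.0  # 0.25 µmole starting scale (assumed reading)
oligo_nmol = 31.4             # isolated phosphorothioate oligonucleotide
reagent_nmol = nmol_from_solution(volume_ul=16, concentration_mm=20)

overall_yield_pct = 100 * oligo_nmol / synthesis_scale_nmol
molar_excess = reagent_nmol / oligo_nmol

print(f"2-iodoethanol added: {reagent_nmol:.0f} nmol")   # 320 nmol
print(f"overall yield: {overall_yield_pct:.1f} %")       # 12.6 %
print(f"molar excess: {molar_excess:.1f}-fold")          # 10.2-fold
```

The computed excess of roughly 10.2-fold agrees with the "10-fold excess" stated above.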

In a variation of this procedure, the nucleic acid primer containing one or
more phosphorothioate phosphodiester bonds is used in the Sanger sequencing reactions.
The primer-extension products of the four sequencing reactions are purified as
exemplified in EXAMPLES 1 - 4, cleaved off the solid support, lyophilized and dissolved
in 4 µl each of TE buffer pH 8.0 and alkylated by addition of 2 µl of a 20 mM solution of
2-iodoethanol in DMF. They are then analyzed by ES and/or MALDI mass spectrometry.
In an analogous way, employing, e.g., 3-iodopropanol or 4-iodobutanol instead of
2-iodoethanol, mass-modified nucleic acid primers are obtained with a mass
difference of 14.03, 28.06 and 42.03 daltons respectively compared to the unmodified
phosphorothioate phosphodiester-containing oligonucleotide.

EXAMPLE 14

Synthesis of 2'-amino-2'-deoxyuridine-5'-triphosphate and 3'-amino-2',3'-
dideoxythymidine-5'-triphosphate mass-modified at the 2'- or 3'-amino function
with glycine or β-alanine residues

Starting material was 2'-azido-2'-deoxyuridine prepared according to
literature (Verheyden et al., J. Org. Chem. 36, 250 (1971)), which was
4,4-dimethoxytritylated at 5'-OH with 4,4-dimethoxytrityl chloride in pyridine and
acetylated at 3'-OH with acetic anhydride in a one-pot reaction using standard reaction
conditions. With 191 mg (0.71 mmole) 2'-azido-2'-deoxyuridine as starting material,
396 mg (0.65 mmol, 90.8 %) 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-azido-2'-deoxyuridine
was obtained after purification via silica gel chromatography. Reduction of the azido
group was performed using published conditions (Barta et al., Tetrahedron 46, 587-594
(1990)). Yield of 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-amino-2'-deoxyuridine after
silica gel chromatography was 288 mg (0.49 mmole; 76 %). This protected
2'-amino-2'-deoxyuridine derivative (588 mg, 1.0 mmole) was reacted with 2 equivalents
(927 mg, 2.0 mmole) N-Fmoc-glycine pentafluorophenyl ester in 10 ml dry DMF overnight at
room temperature in the presence of 1.0 mmole (174 µl) N,N-diisopropylethylamine.
Solvents were removed by evaporation in vacuo and the residue purified by silica gel
chromatography. Yield was 711 mg (0.71 mmole, 82 %). Detritylation was achieved by
a one hour treatment with 80% aqueous acetic acid at room temperature. The residue
was evaporated to dryness, co-evaporated twice with toluene, suspended in 1 ml dry
acetonitrile and 5'-phosphorylated with POCl3 according to literature (Yoshikawa et al.,
Bull. Chem. Soc. Japan 42, 3505 (1969) and Sowa et al., Bull. Chem. Soc. Japan 48,
2084 (1975)) and directly transformed in a one-pot reaction to the 5'-triphosphate using 3
ml of a 0.5 M solution (1.5 mmole) tetra(tri-n-butylammonium) pyrophosphate in DMF
according to literature (e.g. Seela et al., Helvetica Chimica Acta 74, 1048 (1991)). The
Fmoc and the 3'-O-acetyl groups were removed by a one-hour treatment with
concentrated aqueous ammonia at room temperature and the reaction mixture evaporated
and lyophilized. Purification also followed standard procedures by using anion-exchange
chromatography on DEAE-Sephadex with a linear gradient of triethylammonium
bicarbonate (0.1 M - 1.0 M). Triphosphate containing fractions (checked by thin layer
chromatography on polyethyleneimine cellulose plates) were collected, evaporated and
lyophilized. Yield (by UV-absorbance of the uracil moiety) was 68% (0.48 mmole).

A glycyl-glycine modified 2'-amino-2'-deoxyuridine-5'-triphosphate was
obtained by removing the Fmoc group from 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-N-
(N-9-fluorenylmethyloxycarbonyl-glycyl)-2'-amino-2'-deoxyuridine by a one-hour
treatment with a 20% solution of piperidine in DMF at room temperature, evaporation of
solvents, two-fold co-evaporation with toluene and subsequent condensation with
N-Fmoc-glycine pentafluorophenyl ester. Starting with 1.0 mmole of the
2'-N-glycyl-2'-amino-2'-deoxyuridine derivative and following the procedure described
above, 0.72 mmole (72%) of the corresponding
2'-(N-glycyl-glycyl)-2'-amino-2'-deoxyuridine-5'-triphosphate was obtained.

Starting with 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-amino-2'-
deoxyuridine and coupling with N-Fmoc-β-alanine pentafluorophenyl ester, the
corresponding 2'-(N-β-alanyl)-2'-amino-2'-deoxyuridine-5'-triphosphate can be
synthesized. These modified nucleoside triphosphates are incorporated during the Sanger
DNA sequencing process in the primer-extension products. The mass difference between
the glycine, β-alanine and glycyl-glycine mass-modified nucleosides is, per nucleotide
incorporated, 58.06, 72.09 and 115.1 daltons respectively.
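
The three values quoted above appear to match the average masses of the aminoacyl
groups themselves. The following sketch reproduces them under that assumption; the
compositions are inferred from the structures H2N-CH2-CO-, H2N-(CH2)2-CO- and
H2N-CH2-CO-NH-CH2-CO-, and all names in the code are illustrative.

```python
# Average atomic masses in daltons (standard rounded values, assumed here).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_mass(composition):
    """Return the average mass (Da) of a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# Elemental compositions of the aminoacyl groups (an assumption of this sketch).
MODIFIERS = {
    "glycyl": {"C": 2, "H": 4, "N": 1, "O": 1},
    "beta-alanyl": {"C": 3, "H": 6, "N": 1, "O": 1},
    "glycyl-glycyl": {"C": 4, "H": 7, "N": 2, "O": 2},
}

masses = {name: formula_mass(comp) for name, comp in MODIFIERS.items()}
for name, mass in masses.items():
    print(f"{name}: {mass:.2f} Da")
```

With these inputs the sketch gives 58.06, 72.09 and 115.11 Da, in line with the figures
in the text to the stated precision.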

When starting with 5'-O-(4,4-dimethoxytrityl)-3'-amino-2',3'-
dideoxythymidine (obtained by published procedures, see EXAMPLE 12), the
corresponding 3'-(N-glycyl)-3'-amino-, 3'-(N-glycyl-glycyl)-3'-amino- and
3'-(N-β-alanyl)-3'-amino-2',3'-dideoxythymidine-5'-triphosphates can be obtained. These
mass-modified nucleoside triphosphates serve as a terminating nucleotide unit in the
Sanger DNA sequencing reactions providing a mass difference per terminated fragment of
58.06, 72.09 and 115.1 daltons respectively when used in the multiplexing sequencing
mode. The mass-differentiated fragments can then be analyzed by ES and/or MALDI mass
spectrometry.

EXAMPLE 15

Synthesis of deoxyuridine-5'-triphosphate mass-modified at C-5 of the heterocyclic
base with glycine, glycyl-glycine and β-alanine residues

0.281 g (1.0 mmole) 5-(3-aminopropynyl-1)-2'-deoxyuridine (see
EXAMPLE 6) was reacted with either 0.927 g (2.0 mmole) N-Fmoc-glycine
pentafluorophenylester or 0.955 g (2.0 mmole) N-Fmoc-β-alanine pentafluorophenyl ester
in 5 ml dry DMF in the presence of 0.129 g N,N-diisopropylethylamine (174 µl, 1.0
mmole) overnight at room temperature. Solvents were removed by evaporation in vacuo
and the condensation products purified by flash chromatography on silica gel (Still et al.,
J. Org. Chem. 43, 2923-2925 (1978)). Yields were 476 mg (0.85 mmole; 85%) for the
glycine and 436 mg (0.76 mmole; 76%) for the β-alanine derivatives. For the synthesis of
the glycyl-glycine derivative, the Fmoc group of 1.0 mmole Fmoc-glycine-deoxyuridine
derivative was removed by one-hour treatment with 20% piperidine in DMF at room
temperature. Solvents were removed by evaporation in vacuo, the residue was
co-evaporated twice with toluene and condensed with 0.927 g (2.0 mmole) N-Fmoc-glycine
pentafluorophenyl ester and purified as described above. Yield was 445 mg (0.72 mmole;
72%). The glycyl-, glycyl-glycyl- and β-alanyl-2'-deoxyuridine derivatives, N-protected
with the Fmoc group, were transformed to the 3'-O-acetyl derivatives by tritylation with
4,4-dimethoxytrityl chloride in pyridine and acetylation with acetic anhydride in pyridine
in a one-pot reaction and subsequently detritylated by one hour treatment with 80%
aqueous acetic acid according to standard procedures. Solvents were removed, the
residues dissolved in 100 ml chloroform and extracted twice with 50 ml 10% sodium
bicarbonate and once with 50 ml water, dried with sodium sulfate, the solvent evaporated
and the residues purified by flash chromatography on silica gel. Yields were 361 mg
(0.60 mmole; 71%) for the glycyl-, 351 mg (0.57 mmole; 75%) for the β-alanyl- and 323
mg (0.49 mmole; 68%) for the glycyl-glycyl-3'-O-acetyl-2'-deoxyuridine derivatives
respectively. Phosphorylation at the 5'-OH with POCl3, transformation into the
5'-triphosphate by in-situ reaction with tetra(tri-n-butylammonium) pyrophosphate in DMF,
3'-de-O-acetylation, cleavage of the Fmoc group, and final purification by anion-exchange
chromatography on DEAE-Sephadex was performed as described in EXAMPLE 14.
Yields according to UV-absorbance of the uracil moiety were 0.41 mmole
5-(3-(N-glycyl)-amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate (84%), 0.43 mmole
5-(3-(N-β-alanyl)-amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate (75%) and 0.38 mmole
5-(3-(N-glycyl-glycyl)-amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate (78%).

These mass-modified nucleoside triphosphates were incorporated during the
Sanger DNA sequencing primer-extension reactions.
When using 5-(3-aminopropynyl-1)-2',3'-dideoxyuridine as starting material
and following an analogous reaction sequence the corresponding glycyl-, glycyl-glycyl-
and β-alanyl-2',3'-dideoxyuridine-5'-triphosphates were obtained in yields of 69, 63 and
71% respectively. These mass-modified nucleoside triphosphates serve as
chain-terminating nucleotides during the Sanger DNA sequencing reactions. The
mass-modified sequencing ladders are analyzed by either ES or MALDI mass spectrometry.

EXAMPLE 16 

Synthesis of 8-glycyl- and 8-glycyl-glycyl-2'-deoxyadenosine-5'-triphosphate

727 mg (1.0 mmole) of N6-(4-tert-butylphenoxyacetyl)-8-glycyl-5'-(4,4-
dimethoxytrityl)-2'-deoxyadenosine or 800 mg (1.0 mmole) N6-(4-tert-
butylphenoxyacetyl)-8-glycyl-glycyl-5'-(4,4-dimethoxytrityl)-2'-deoxyadenosine prepared
according to EXAMPLES 10 and 11 and literature (Köster et al., Tetrahedron 37, 362
(1981)) were acetylated with acetic anhydride in pyridine at the 3'-OH, detritylated at the
5'-position with 80% acetic acid in a one-pot reaction and transformed into the
5'-triphosphates via phosphorylation with POCl3 and reaction in-situ with
tetra(tri-n-butylammonium) pyrophosphate as described in EXAMPLE 14. Deprotection of
the N6-tert-butylphenoxyacetyl, the 3'-O-acetyl and the O-methyl group at the glycine
residues was achieved with concentrated aqueous ammonia for ninety minutes at room
temperature. Ammonia was removed by lyophilization and the residue washed with
dichloromethane, solvent removed by evaporation in vacuo and the remaining solid
material purified by anion-exchange chromatography on DEAE-Sephadex using a linear
gradient of triethylammonium bicarbonate from 0.1 to 1.0 M. The nucleoside
triphosphate containing fractions (checked by TLC on polyethyleneimine cellulose plates)
were combined and lyophilized. Yield of the 8-glycyl-2'-deoxyadenosine-5'-triphosphate
(determined by UV-absorbance of the adenine moiety) was 57% (0.57 mmole). The yield
for the 8-glycyl-glycyl-2'-deoxyadenosine-5'-triphosphate was 51% (0.51 mmole).

These mass-modified nucleoside triphosphates were incorporated during
primer-extension in the Sanger DNA sequencing reactions.
When using the corresponding N6-(4-tert-butylphenoxyacetyl)-8-glycyl- or
-glycyl-glycyl-5'-O-(4,4-dimethoxytrityl)-2',3'-dideoxyadenosine derivatives as starting
materials prepared according to standard procedures (see, e.g., for the introduction of the
2',3'-function: Seela et al., Helvetica Chimica Acta 74, 1048-1058 (1991)) and using an
analogous reaction sequence as described above, the chain-terminating mass-modified
nucleoside triphosphates 8-glycyl- and 8-glycyl-glycyl-2',3'-dideoxyadenosine-5'-
triphosphates were obtained in 53 and 47% yields respectively. The mass-modified
sequencing fragment ladders are analyzed by either ES or MALDI mass spectrometry.

EXAMPLE 17

Mass-modification of Sanger DNA sequencing fragment ladders by incorporation
of chain-elongating 2'-deoxy- and chain-terminating 2',3'-dideoxythymidine-5'-
(alpha-S-)-triphosphate and subsequent alkylation with 2-iodoethanol and
3-iodopropanol

2',3'-Dideoxythymidine-5'-(alpha-S)-triphosphate was prepared according to
published procedures (e.g., for the alpha-S-triphosphate moiety: Eckstein et al.,
Biochemistry 15, 1685 (1976) and Accounts Chem. Res. 12, 204 (1978) and for the 2',3'-
dideoxy moiety: Seela et al., Helvetica Chimica Acta 74, 1048-1058 (1991)). Sanger
DNA sequencing reactions employing 2'-deoxythymidine-5'-(alpha-S)-triphosphate are
performed according to standard protocols (e.g. Eckstein, Ann. Rev. Biochem. 54, 367
(1985)). When using 2',3'-dideoxythymidine-5'-(alpha-S)-triphosphate, this is used
instead of the unmodified 2',3'-dideoxythymidine-5'-triphosphate in standard Sanger DNA
sequencing (see e.g. Swerdlow et al., Nucleic Acids Res. 18, 1415-1419 (1990)). The
template (2 pmole) and the nucleic acid M13 sequencing primer (4 pmole) modified
according to EXAMPLE 1 are annealed by heating to 65°C in 100 µl of 10 mM Tris-HCl
pH 7.5, 10 mM MgCl2, 50 mM NaCl, 7 mM dithiothreitol (DTT) for 5 min and slowly
brought to 37°C during a one hour period. The sequencing reaction mixtures contain, as
exemplified for the T-specific termination reaction, in a final volume of 150 µl, 200 µM
(final concentration) each of dATP, dCTP, dTTP, 300 µM c7-deaza-dGTP, 5 µM
2',3'-dideoxythymidine-5'-(alpha-S)-triphosphate and 40 units Sequenase (United States
Biochemicals). Polymerization is performed for 10 min at 37°C, the reaction mixture
heated to 70°C to inactivate the Sequenase, ethanol precipitated and coupled to thiolated
Sequelon membrane disks (8 mm diameter) as described in EXAMPLE 1. Alkylation is
performed by treating the disks with 10 µl of 10 mM solution of either 2-iodoethanol or
3-iodopropanol in NMM (N-methylmorpholine/water/2-propanol, 2/49/49, v/v/v) (three
times), washing with 10 µl NMM (three times) and cleaving the alkylated T-terminated
primer-extension products off the support by treatment with DTT as described in
EXAMPLE 1. Analysis of the mass-modified fragment families is performed with either
ES or MALDI mass spectrometry.
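
As a rough cross-check of the T-specific reaction mixture described above, the final
concentrations can be converted into absolute amounts for the 150 µl volume. This is a
minimal sketch; the dictionary keys and helper name are illustrative labels, not
identifiers from the original protocol.

```python
FINAL_VOLUME_UL = 150.0

# Final concentrations (µM) as listed for the T-specific termination reaction.
final_conc_um = {
    "dATP": 200.0,
    "dCTP": 200.0,
    "dTTP": 200.0,
    "c7-deaza-dGTP": 300.0,
    "ddTTP(alpha-S)": 5.0,
}


def pmol_in_volume(conc_um, volume_ul):
    """Amount (pmol) of a solute at conc_um (µM) in volume_ul (µl)."""
    # 1 µM equals 1 pmol per µl.
    return conc_um * volume_ul


amounts_pmol = {
    name: pmol_in_volume(conc, FINAL_VOLUME_UL) for name, conc in final_conc_um.items()
}
elongator_to_terminator = final_conc_um["dTTP"] / final_conc_um["ddTTP(alpha-S)"]
primer_to_template = 4 / 2  # 4 pmole primer annealed to 2 pmole template

for name, pmol in amounts_pmol.items():
    print(f"{name}: {pmol / 1000:.2f} nmol")
print(f"dTTP : terminator = {elongator_to_terminator:.0f} : 1")
print(f"primer : template = {primer_to_template:.0f} : 1")
```

Under these figures each unmodified dNTP is present at 30 nmol, the chain terminator at
0.75 nmol, and the elongating thymidine nucleotide is in 40-fold excess over the
terminator.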

EXAMPLE 18

Analysis of a Mixture of Oligothymidylic Acids

Oligothymidylic acid, oligo p(dT)12-18, is commercially available (United
States Biochemical, Cleveland, OH). Generally, a matrix solution of 0.5 M in ethanol
was prepared. Various matrices were used for this Example and Examples 19-21 such
as 3,5-dihydroxybenzoic acid, sinapinic acid, 3-hydroxypicolinic acid and 2,4,6-
trihydroxyacetophenone. Oligonucleotides were lyophilized after purification by HPLC
and taken up in ultrapure water (MilliQ, Millipore) using amounts to obtain a
concentration of 10 pmoles/µl as stock solution. An aliquot (1 µl) of this concentration
or a dilution in ultrapure water was mixed with 1 µl of the matrix solution on a flat metal
surface serving as the probe tip and dried with a fan using cold air. In some experiments,
cation-ion exchange beads in the acid form were added to the mixture of matrix and
sample solution.
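
For orientation, the amounts deposited on the probe tip follow directly from the stated
volumes and concentrations. The sketch below assumes the undiluted 10 pmoles/µl stock is
used; the variable names are illustrative.

```python
PMOL_PER_UMOL = 1_000_000

# Sample: 1 µl of the 10 pmol/µl oligonucleotide stock (undiluted, assumed).
analyte_pmol = 1.0 * 10.0

# Matrix: 1 µl of a 0.5 M solution; 0.5 mol/l is 0.5 µmol/µl.
matrix_umol = 1.0 * 0.5
matrix_pmol = matrix_umol * PMOL_PER_UMOL

molar_ratio = matrix_pmol / analyte_pmol

print(f"analyte on target: {analyte_pmol:.0f} pmol")
print(f"matrix on target: {matrix_umol:.1f} µmol")
print(f"matrix : analyte = {molar_ratio:.0f} : 1")  # 50000 : 1
```

Diluting the stock before spotting raises the matrix-to-analyte ratio proportionally.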

MALDI-TOF spectra were obtained for this Example and Examples 19-21
on different commercial instruments such as Vision 2000 (Finnigan-MAT), VG TofSpec
(Fisons Instruments), LaserTec Research (Vestec). The conditions for this Example were
linear negative ion mode with an acceleration voltage of 25 kV. The MALDI-TOF
spectrum generated is shown in FIGURE 14. Mass calibration was done externally and
generally achieved by using defined peptides of appropriate mass range such as insulin,
gramicidin S, trypsinogen, bovine serum albumin, and cytochrome C. All spectra were
generated by employing a nitrogen laser with 5 nsec pulses at a wavelength of 337 nm.
Laser energy varied between 10^6 and 10^7 W/cm^2. To improve signal-to-noise ratio
generally, the intensities of 10 to 30 laser shots were accumulated.

EXAMPLE 19



Mass Spectro metric Analysis of a 50-mer and a 99-mer 

Two large oligonucleotides were analyzed by mass spectrometry. The 50-mer
d(TAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT)
(SEQ ID NO:3) and dT(pdT)99 were used. The oligodeoxynucleotides were synthesized
using β-cyanoethylphosphoamidites and purified using published procedures (e.g., N.D.
Sinha, J. Biernat, J. McManus and H. Koster, Nucleic Acids Res. 12, 4539 (1984)),
employing commercially available DNA synthesizers from either Millipore (Bedford, MA)
or Applied Biosystems (Foster City, CA) and HPLC equipment and RP18 reverse phase
columns from Waters (Milford, MA). The samples for mass spectrometric analysis were
prepared as described in Example 18. The conditions used for MALDI-MS analysis of
each oligonucleotide were 500 fmol of each oligonucleotide, reflectron positive ion mode
with an acceleration of 5 kV and postacceleration of 20 kV. The MALDI-TOF spectra
generated were superimposed and are shown in FIGURE 15.

EXAMPLE 20 



Simulation of the DNA Sequencing Results of FIGURE 2

The 13 DNA sequences representing the nested dT-terminated fragments
of the Sanger DNA sequencing for the 50-mer described in Example 19 (SEQ ID NO:3)
were synthesized as described in Example 19. The samples were treated and 500 fmol of
each fragment was analyzed by MALDI-MS as described in Example 18. The resulting
MALDI-TOF spectra are shown in FIGURE 16. The conditions were reflectron positive
ion mode with an acceleration of 5 kV and postacceleration of 20 kV. Calculated
molecular masses and experimental molecular masses are shown in Table I.
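
The set of nested fragments can be illustrated with a short Python sketch. It rests on
two assumptions that are not stated in the text: each fragment is taken to be a 5' prefix
of the 50-mer of SEQ ID NO:3 that ends in T, and the single-base prefix is excluded
through the hypothetical parameter min_length. Under these assumptions the count agrees
with the 13 fragments named above, and the 10-mer/11-mer and 37-mer/38-mer pairs appear
among the fragment lengths.

```python
# The 50-mer of Example 19 (SEQ ID NO:3), copied from the text.
SEQ_50MER = "TAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT"


def t_terminated_fragments(seq, min_length=2):
    """Return every prefix of seq that ends in T and has at least min_length bases.

    min_length is an illustrative assumption: it drops the one-base prefix "T".
    """
    return [
        seq[: i + 1]
        for i, base in enumerate(seq)
        if base == "T" and i + 1 >= min_length
    ]


fragments = t_terminated_fragments(SEQ_50MER)
lengths = [len(fragment) for fragment in fragments]

print(len(fragments))  # number of dT-terminated fragments
print(lengths)         # fragment lengths in bases
```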

The MALDI-TOF spectra were superimposed (FIGURE 17) to
demonstrate that the individual peaks are resolvable even between the 10-mer and 11-mer
(upper panel) and the 37-mer and 38-mer (lower panel). The two panels show two
different scales and the spectra analyzed at that scale.

EXAMPLE 21

MALDI-MS Analysis of a Mass-Modified Oligonucleotide

A 17-mer was mass-modified at C-5 of one or two deoxyuridine moieties.
5-[13-(2-Methoxyethoxyl)-tridecyne-1-yl]-5'-O-(4,4'-dimethoxytrityl)-2'-deoxyuridine-3'-
β-cyanoethyl-N,N-diisopropylphosphoamidite was used to synthesize the modified
17-mers using the methods described in Example 19.

The modified 17-mers were

                    X
                    |
a: d(TAAAACGACGGCCAGUG) (molecular mass: 5454)
   (SEQ ID NO:4)

     X              X
     |              |
b: d(UAAAACGACGGCCAGUG) (molecular mass: 5634)
   (SEQ ID NO:5)

where X = -C≡C-(CH2)11-OH

(unmodified 17-mer: molecular mass: 5273)

The samples were prepared and 500 fmol of each modified 17-mer was
analyzed using MALDI-MS as described in Example 18. The conditions used were
reflectron positive ion mode with an acceleration of 5 kV and postacceleration of 20 kV.
The MALDI-TOF spectra which were generated were superimposed and are shown in
FIGURE 18.

EXAMPLE 22 

Detection of Polymerase Chain Reaction Products Containing 7-Deazapurine 

MATERIALS AND METHODS
PCR amplifications

The following oligodeoxynucleotide primers were either synthesized
according to standard phosphoamidite chemistry (Sinha, N.D., et al., (1983) Tetrahedron
Let. Vol. 24, Pp. 5843-5846; Sinha, N.D., et al., (1984) Nucleic Acids Res., Vol. 12, Pp.
4539-4557) on a MilliGen 7500 DNA synthesizer (Millipore, Bedford, MA, USA) in 200
nmol scales or purchased from MWG-Biotech (Ebersberg, Germany, primer 3) and
Biometra (Goettingen, Germany, primers 6-7).

primer 1: 5'-GTCACCCTCGACCTGCAG (SEQ. ID. NO. 6);
primer 2: 5'-TTGTAAAACGACGGCCAGT (SEQ. ID. NO. 7);
primer 3: 5'-CTTCCACCGCGATGTTGA (SEQ. ID. NO. 8);
primer 4: 5'-CAGGAAACAGCTATGAC (SEQ. ID. NO. 9);
primer 5: 5'-GTAAAACGACGGCCAGT (SEQ. ID. NO. 10);
primer 6: 5'-GTCACCCTCGACCTGCAgC (g: RiboG) (SEQ. ID. NO. 11);
primer 7: 5'-GTTGTAAAACGAGGGCCAgT (g: RiboG) (SEQ. ID. NO. 12);

The 99-mer and 200-mer DNA strands (modified and unmodified) as well
as the ribo- and 7-deaza-modified 100-mer were amplified from pRFcl DNA (10 ng,
generously supplied by S. Feyerabend, University of Hamburg) in 100 µL reaction volume
containing 10 mmol/L KCl, 10 mmol/L (NH4)2SO4, 20 mmol/L Tris HCl (pH = 8.8), 2
mmol/L MgSO4 (exo(-)Pyrococcus furiosus (Pfu)-Buffer, Pharmacia, Freiburg,
Germany), 0.2 mmol/L each dNTP (Pharmacia, Freiburg, Germany), 1 µmol/L of each
primer and 1 unit of exo(-)Pfu DNA polymerase (Stratagene, Heidelberg, Germany).

For the 99-mer primers 1 and 2, for the 200-mer primers 1 and 3 and for
the 100-mer primers 6 and 7 were used. To obtain 7-deazapurine modified nucleic acids,
dATP and dGTP were replaced with 7-deaza-dATP and 7-deaza-dGTP during PCR
amplification. The reaction was performed in a thermal cycler (OmniGene, MWG-Biotech,
Ebersberg, Germany) using the cycle: denaturation at 95 °C for 1 min, annealing
at 51 °C for 1 min and extension at 72 °C for 1 min. For all PCRs the number of reaction
cycles was 30. The reaction was allowed to extend for an additional 10 min at 72 °C after
the last cycle.
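
For orientation, the cycling program described above can be written down as data. The
following Python sketch is illustrative only: the names Step, CYCLE, FINAL_EXTENSION and
total_hold_minutes are hypothetical, and ramp times between temperatures are ignored, so
the result is a lower bound on the run time rather than an instrument setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One temperature hold of the thermal cycling program."""
    name: str
    temp_c: float
    minutes: float


# Values taken from the text for the 99-mer, 100-mer and 200-mer amplifications.
CYCLE = (
    Step("denaturation", 95, 1),
    Step("annealing", 51, 1),
    Step("extension", 72, 1),
)
FINAL_EXTENSION = Step("final extension", 72, 10)
N_CYCLES = 30


def total_hold_minutes(cycle, n_cycles, final_step):
    """Sum of all hold times; ramping between temperatures is not included."""
    return n_cycles * sum(step.minutes for step in cycle) + final_step.minutes


hold_minutes = total_hold_minutes(CYCLE, N_CYCLES, FINAL_EXTENSION)
print(hold_minutes)
```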

The 103-mer DNA strands (modified and unmodified) were amplified
from M13mp18 RFI DNA (100 ng, Pharmacia, Freiburg, Germany) in 100 µL reaction
volume using primers 4 and 5; all other concentrations were unchanged. The reaction was
performed using the cycle: denaturation at 95 °C for 1 min, annealing at 40 °C for 1 min
and extension at 72 °C for 1 min. After 30 cycles for the unmodified and 40 cycles for
the modified 103-mer, respectively, the samples were incubated for an additional 10 min at
72 °C.

Synthesis of 5'-[32P]-labeled PCR primers
Primers 1 and 4 were 5'-[32P]-labeled employing T4 polynucleotide kinase
(Epicentre Technologies) and (γ-32P)-ATP (BLU/NGG/502A, Dupont, Germany)
according to the protocols of the manufacturer. The reactions were performed
substituting 10% of primer 1 and 4 in PCR with the labeled primers under otherwise
unchanged reaction conditions. The amplified DNAs were separated by gel
electrophoresis on a 10% polyacrylamide gel. The appropriate bands were excised and
counted on a Packard TRI-CARB 460C liquid scintillation system (Packard, CT, USA).

Primer-cleavage from ribo-modified PCR-product
The amplified DNA was purified using Ultrafree-MC filter units (30,000
NMWL), redissolved in 100 µl of 0.2 mol/L NaOH and heated at 95 °C for 25
minutes. The solution was then acidified with HCl (1 mol/L) and further purified for
MALDI-TOF analysis employing Ultrafree-MC filter units (10,000 NMWL) as described
below.

Purification of PCR products

All samples were purified and concentrated using Ultrafree-MC units
30,000 NMWL (Millipore, Eschborn, Germany) according to the manufacturer's
description. After lyophilisation, PCR products were redissolved in 5 µL (3 µL for the
200-mer) of ultrapure water. This analyte solution was directly used for MALDI-TOF
measurements.

MALDI-TOF MS

Aliquots of 0.5 µL of analyte solution and 0.5 µL of matrix solution (0.7
mol/L 3-HPA and 0.07 mol/L ammonium citrate in acetonitrile/water (1:1, v/v)) were
mixed on a flat metallic sample support. After drying at ambient temperature the sample
was introduced into the mass spectrometer for analysis. The MALDI-TOF mass
spectrometer used was a Finnigan MAT Vision 2000 (Finnigan MAT, Bremen,
Germany). Spectra were recorded in the positive ion reflector mode with a 5 keV ion
source and 20 keV postacceleration. The instrument was equipped with a nitrogen laser
(337 nm wavelength). The vacuum of the system was 3-4 x 10^-8 hPa in the analyzer
region and 1-4 x 10^-7 hPa in the source region. Spectra of modified and unmodified DNA
samples were obtained with the same relative laser power; external calibration was
performed with a mixture of synthetic oligodeoxynucleotides (7- to 50-mer).

RESULTS AND DISCUSSION

Enzymatic synthesis of 7-deazapurine nucleotide containing nucleic
acids by PCR

In order to demonstrate the feasibility of MALDI-TOF MS for the rapid,
gel-free analysis of short PCR products and to investigate the effect of 7-deazapurine
modification of nucleic acids under MALDI-TOF conditions, two different
primer-template systems were used to synthesize DNA fragments. Sequences are displayed in
Figures 24 and 25. While the two single strands of the 103-mer PCR product had nearly
equal masses (Δm = 8 u), the two single strands of the 99-mer differed by 526 u.

Considering the facts that 7-deaza purine nucleotide building blocks for
chemical DNA synthesis are approximately 160 times more expensive than regular ones
(Product Information, Glen Research Corporation, Sterling, VA) and their application in
standard β-cyano-phosphoamidite chemistry is not trivial (Product Information, Glen
Research Corporation, Sterling, VA; Schneider, K. and B.T. Chait (1995) Nucleic Acids
Res. 23, 1570), the cost of 7-deaza purine modified primers would be very high.
Therefore, to increase the applicability and scope of the method, all PCRs were
performed using unmodified oligonucleotide primers which are routinely available.
Substituting dATP and dGTP by c7-dATP and c7-dGTP in polymerase chain reaction led
to products containing approximately 80% 7-deaza-purine modified nucleosides for the
99-mer and 103-mer, and about 90% for the 200-mer, respectively. Table II shows the
base composition of all PCR products.

TABLE II

Base composition of the 99-mer, 103-mer and 200-mer PCR amplification products
(unmodified and 7-deaza purine modified)

DNA fragment (1)      C    T    A    G    c7-deaza-A   c7-deaza-G   rel. mod. (2)
103-mer s             28   23   24   28   -            -            -
modified 103-mer s    28   23   6    5    18           23           79%
103-mer a             28   24   23   28   -            -            -
modified 103-mer a    28   24   7    4    16           24           78%
99-mer s              34   21   24   20   -            -            -
modified 99-mer s     34   21   6    5    18           15           75%
99-mer a              20   24   21   34   -            -            -
modified 99-mer a     20   24   3    4    18           30           87%
200-mer s, 200-mer a and their modified forms: entries not recoverable (one surviving value: 50)

(1) "s" and "a" describe "sense" and "antisense" strands of the double-stranded PCR
product.
(2) indicates relative modification as percentage of 7-deaza purine modified nucleotides of
total amount of purine nucleotides.
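
The relative modification defined in the second footnote can be recomputed from the
purine counts. The following Python sketch is a minimal illustration, assuming the
counts of the modified 103-mer and 99-mer strands as read from Table II (unmodified A
and G remaining in a modified strand are taken to stem from the primer); the names
MODIFIED_STRANDS and relative_modification are hypothetical.

```python
# Purine counts per modified strand: (A, G, c7-deaza-A, c7-deaza-G).
# Assumption: values as read from Table II for the 103-mer and 99-mer strands.
MODIFIED_STRANDS = {
    "103-mer s": (6, 5, 18, 23),
    "103-mer a": (7, 4, 16, 24),
    "99-mer s": (6, 5, 18, 15),
    "99-mer a": (3, 4, 18, 30),
}


def relative_modification(a, g, c7a, c7g):
    """Percentage of 7-deaza purines among all purine nucleotides of a strand."""
    modified = c7a + c7g
    return 100.0 * modified / (a + g + modified)


percentages = {
    strand: round(relative_modification(*counts))
    for strand, counts in MODIFIED_STRANDS.items()
}

for strand, percent in percentages.items():
    print(f"{strand}: {percent}%")
```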



However, it remained to be determined whether 80-90% 7-deaza-purine
modification would be sufficient for accurate mass spectrometer detection. It was
therefore important to determine whether all purine nucleotides could be substituted
during the enzymatic amplification step. It was found that exo(-)Pyrococcus furiosus
(Pfu) DNA polymerase indeed could accept c7-dATP and c7-dGTP in the absence of
unmodified purine triphosphates. However, the incorporation was less efficient, leading
to a lower yield of PCR product (Figure 26). Ethidium bromide stains by intercalation
with the stacked bases of the DNA double strand. Therefore lower band intensities in the
ethidium bromide stained gel might be artifacts, since the modified DNA strands do not
necessarily need to give the same band intensities as the unmodified ones.

To verify these results, the PCRs with [32P]-labeled primers were
repeated. The autoradiogram (Figure 27) clearly shows lower yields for the modified
PCR products. The bands were excised from the gel and counted. For all PCR products
the yield of the modified nucleic acids was about 50% relative to the corresponding
unmodified amplification product. Further experiments showed that exo(-)Deep Vent and
Vent DNA polymerase were able to incorporate c7-dATP and c7-dGTP during PCR as
well. The overall performance, however, turned out to be best for the exo(-)Pfu DNA
polymerase, giving the fewest side products during amplification. Using all three polymerases,
it was found that such PCRs employing c7-dATP and c7-dGTP instead of their isosteres
showed fewer side reactions, giving a cleaner PCR product. Decreased occurrence of
amplification side products may be explained by a reduction of primer mismatches due to
a lower stability of the complex formed from the primer and the 7-deaza-purine
containing template which is synthesized during PCR. Decreased melting points for DNA
duplexes containing 7-deaza-purine have been described (Mizusawa, S. et al., (1986)
Nucleic Acids Res., 14, 1319-1324). In addition to the three polymerases specified above
(exo(-)Deep Vent DNA polymerase, Vent DNA polymerase and exo(-)Pfu DNA
polymerase), it is anticipated that other polymerases, such as the Large Klenow fragment
of E. coli DNA polymerase, Sequenase, Taq DNA polymerase, and U AmpliTaq,
AmpliTaq or AmpliTaq TS DNA polymerase can be used. In addition, where RNA is the
template, RNA polymerases, such as the SP6 or the T7 RNA polymerase, must be used.

MALDI-TOF mass spectrometry of modified and unmodified PCR products

The 99-mer, 103-mer and 200-mer PCR products were analyzed by
MALDI-TOF MS. Based on past experience, it was known that the degree of
depurination depends on the laser energy used for desorption and ionization of the
analyte. Since the influence of 7-deazapurine modification on fragmentation due to
depurination was to be investigated, all spectra were measured at the same relative laser
energy.

Figures 28a and 28b show the mass spectra of the modified and
unmodified 103-mer nucleic acids. In case of the unmodified 103-mer, fragmentation
causes a broad (M+H)+ signal. The maximum of the peak is shifted to lower masses so
that the assigned mass represents a mean value of the (M+H)+ signal and signals of
fragmented ions, rather than the (M+H)+ signal itself. Although the modified 103-mer
still contains about 20% A and G from the oligonucleotide primers, it shows less
fragmentation, which is featured by much narrower and more symmetric signals. In
particular, peak tailing on the lower mass side due to depurination is substantially
reduced. Hence, the difference between measured and calculated mass is strongly reduced,
although the measured mass is still below the expected mass. For the unmodified sample
a (M+H)+ signal of 31670 was observed, which is a 97 u or 0.3% difference to the
calculated mass, while in case of the modified sample this mass difference diminished to
10 u or 0.03% (31713 u found, 31723 u calculated). These observations are verified by a
significant increase in mass resolution of the (M+H)+ signal of the two single strands
(m/Δm = 67 as opposed to 18 for the unmodified sample, with Δm = full width at half
maximum, fwhm). Because of the low mass difference between the two single strands (8 u)
their individual signals were not resolved.

With the results of the 99 base pair DNA fragments the effect of
increased mass resolution for 7-deazapurine containing DNA becomes even more
evident. The two single strands in the unmodified sample were not resolved even though
the mass difference between the two strands of the PCR product was very high with 526
u due to unequal distribution of purines and pyrimidines (Figure 29a). In contrast to this,
the modified DNA showed distinct peaks for the two single strands (Figure 29b), which
makes the superiority of this approach over gel electrophoretic methods for the
determination of molecular weights even more profound. Although baseline resolution was
not obtained, the individual masses could be assigned with an accuracy of 0.1%: Δm
= 27 u for the lighter (calc. mass = 30224 u) and Δm = 14 u for the heavier strand (calc.
mass = 30750 u). Again, it was found that the full width at half maximum was
substantially decreased for the 7-deazapurine containing sample.

In case of both the 99-mer and 103-mer the 7-deazapurine containing
nucleic acids seem to give higher sensitivity despite the fact that they still contain about
20% unmodified purine nucleotides. To get a comparable signal-to-noise ratio at similar
intensities for the (M+H)+ signals, the unmodified 99-mer required 20 laser shots in
contrast to 12 for the modified one, and the 103-mer required 12 shots for the unmodified
sample as opposed to three for the 7-deazapurine nucleoside-containing PCR product.
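
The mass deviations and resolutions quoted above for the 103-mer and 99-mer can be
checked with a few lines of arithmetic. The Python sketch below is illustrative only.
It assumes a crude resolvability criterion that is not taken from the text (two peaks
count as separated when their mass difference is at least the peak width Δm = m/resolution),
and it applies the resolution values reported for the 103-mer (67 and 18) to the
99-mer strands as a further assumption. The function names are hypothetical.

```python
def mass_deviation(found_u, calculated_u):
    """Return (calculated - found in u, relative deviation in percent)."""
    delta = calculated_u - found_u
    return delta, 100.0 * abs(delta) / calculated_u


def fwhm(mass_u, resolution):
    """Peak width (full width at half maximum) implied by m/Δm at mass m."""
    return mass_u / resolution


def separated(mass_a_u, mass_b_u, resolution):
    """Crude criterion (assumption): peaks are separated if they differ by >= one fwhm."""
    mean_mass = (mass_a_u + mass_b_u) / 2.0
    return abs(mass_a_u - mass_b_u) >= fwhm(mean_mass, resolution)


# Modified 103-mer: 31713 u found, 31723 u calculated.
delta_mod, percent_mod = mass_deviation(31713, 31723)

# Unmodified 103-mer: 31670 u found, 97 u below the calculated mass.
delta_unmod, percent_unmod = mass_deviation(31670, 31670 + 97)

print(delta_mod, round(percent_mod, 2))
print(delta_unmod, round(percent_unmod, 1))

# Peak widths implied by the reported resolutions (about 473 u and about 1765 u),
# both far larger than the 8 u separating the two 103-mer strands.
print(round(fwhm(31723, 67)), round(fwhm(31767, 18)))

# 99-mer strands (calculated masses 30224 u and 30750 u, 526 u apart).
print(separated(30224, 30750, 67), separated(30224, 30750, 18))
```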

Comparing the spectra of the modified and unmodified 200-mer
amplicons, improved mass resolution was again found for the 7-deazapurine containing
sample as well as increased signal intensities (Figures 30a and 30b). While the signal of
the single strands predominates in the spectrum of the modified sample, the DNA duplex
and dimers of the single strands gave the strongest signal for the unmodified sample.

A complete 7-deaza purine modification of nucleic acids may be achieved
either using modified primers in PCR or cleaving the unmodified primers from the
partially modified PCR product. Since disadvantages are associated with modified
primers, as described above, a 100-mer was synthesized using primers with a
ribo-modification. The primers were cleaved hydrolytically with NaOH according to a method
developed earlier in our laboratory (Koester, H. et al., Z. Physiol. Chem. 359:1570-1589).
Figures 31a and 31b display the spectra of the PCR product before and after
primer cleavage. Figure 31b shows that the hydrolysis was successful: both the hydrolyzed
PCR product and the two released primers could be detected, together with a small
signal from residual uncleaved 100-mer. This procedure is especially useful for the
MALDI-TOF analysis of very short PCR products since the share of unmodified purines
originating from the primer increases with decreasing length of the amplified sequence.

The remarkable properties of 7-deazapurine modified nucleic acids can be
explained by either more effective desorption and/or ionization, increased ion stability
and/or a lower denaturation energy of the double stranded purine modified nucleic acid.
The exchange of the N-7 for a methine group results in the loss of one acceptor for a
hydrogen bond, which influences the ability of the nucleic acid to form secondary
structures due to non-Watson-Crick base pairing (Seela, F. and A. Kehne (1987)
Biochemistry, 26, 2232-2238), which should be a reason for better desorption during the
MALDI process. In addition to this, the aromatic system of 7-deazapurine has a lower
electron density that weakens Watson-Crick base pairing, resulting in a decreased melting
point (Mizusawa, S. et al., (1986) Nucleic Acids Res., 14, 1319-1324) of the double
strand. This effect may decrease the energy needed for denaturation of the duplex in the
MALDI process. These aspects, as well as the loss of a site which probably will carry a
positive charge on the N-7 nitrogen, render the 7-deazapurine modified nucleic acid less
polar and may promote the effectiveness of desorption.

Because of the absence of N-7 as proton acceptor and the decreased
polarization of the C-N bond in 7-deazapurine nucleosides, depurination following the
mechanisms established for hydrolysis in solution is prevented. Although a direct
correlation of reactions in solution and in the gas phase is problematic, less fragmentation
due to depurination of the modified nucleic acids can be expected in the MALDI process.
Depurination may either be accompanied by loss of charge, which decreases the total yield
of charged species, or it may produce charged fragmentation products, which decrease
the intensity of the non-fragmented molecular ion signal.

The observation of both increased sensitivity and decreased peak tailing of
the (M+H)+ signals on the lower mass side due to decreased fragmentation of the
7-deazapurine containing samples indicates that the N-7 atom indeed is essential for the
mechanism of depurination in the MALDI-TOF process. In conclusion, 7-deazapurine
containing nucleic acids show distinctly increased ion stability and sensitivity under
MALDI-TOF conditions and therefore provide for higher mass accuracy and mass
resolution.

EXAMPLE 23 

Solid State Sequencing and Mass Spectrometer Detection

MATERIALS AND METHODS

Oligonucleotides were purchased from Operon Technologies (Alameda,
CA) in an unpurified form. Their sequences are listed in Table III. Sequencing reactions
were performed on a solid surface using reagents from the sequencing kit for Sequenase
Version 2.0 (Amersham, Arlington Heights, Illinois).

Sequencing a 39-mer target

Sequencing complex:
5'-TCTGGCCTGGTGCAGGGCCTATTGTAGTTGTGACGTACA-(A^b)n-3'
(DNA11683) (SEQ. ID. No. 13)

3'-TCAACACTGCATGT-5' (PNA16/DNA) (SEQ. ID. No. 14)

In order to perform solid-state DNA sequencing, template strand
DNA11683 was 3'-biotinylated by terminal deoxynucleotidyl transferase. A 30 µl
reaction, containing 60 pmol of DNA11683, 1.3 nmol of biotin-14-dATP (GIBCO BRL,
Grand Island, NY), 30 units of terminal transferase (Amersham, Arlington Heights,
Illinois), and 1x reaction buffer (supplied with enzyme), was incubated at 37 °C for 1
hour. The reaction was stopped by heat inactivation of the terminal transferase at 70 °C
for 10 min. The resulting product was desalted by passing through a TE-10 spin column
(Clontech). More than one molecule of biotin-14-dATP could be added to the 3'-end
of DNA11683. The biotinylated DNA11683 was incubated with 0.3 mg of Dynal
streptavidin beads in 30 µl 1x binding and washing buffer at ambient temperature for 30
min. The beads were washed twice with TE and redissolved in 30 µl TE; a 10 µl aliquot
(containing 0.1 mg of beads) was used for sequencing reactions.

The 0.1 mg of beads from the previous step were resuspended in a 10 µl volume
containing 2 µl of 5x Sequenase buffer (200 mM Tris-HCl, pH 7.5, 100 mM MgCl2, and
250 mM NaCl) from the Sequenase kit and 5 pmol of the corresponding primer
PNA16/DNA. The annealing mixture was heated to 70 °C and allowed to cool slowly to
room temperature over a 20-30 min time period. Then 1 µl 0.1 M dithiothreitol solution,
1 µl Mn buffer (0.15 M sodium isocitrate and 0.1 M MnCl2), and 2 µl of diluted
Sequenase (3.25 units) were added. The reaction mixture was divided into four aliquots
of 3 µl each and mixed with termination mixes (each consists of 3 µl of the appropriate
termination mix: 32 µM c7dATP, 32 µM dCTP, 32 µM c7dGTP, 32 µM dTTP and 3.2
µM of one of the four ddNTPs, in 50 mM NaCl). The reaction mixtures were incubated
at 37 °C for 2 min. After the completion of extension, the beads were precipitated and
the supernatant was removed. The beads were washed twice and resuspended in TE and
kept at 4 °C.
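
As a worked illustration of the termination step above, the nucleotide concentrations
after mixing a 3 µl reaction aliquot with 3 µl of termination mix can be estimated by
simple dilution. The Python sketch assumes additive volumes and no nucleotides in the
reaction aliquot itself; the function name final_concentration_um is hypothetical.

```python
def final_concentration_um(stock_um, stock_volume_ul, total_volume_ul):
    """Concentration (µM) after diluting stock_volume_ul of stock to total_volume_ul."""
    return stock_um * stock_volume_ul / total_volume_ul


REACTION_ALIQUOT_UL = 3.0   # aliquot of the primed template/enzyme mixture
TERMINATION_MIX_UL = 3.0    # termination mix added to each aliquot
total_ul = REACTION_ALIQUOT_UL + TERMINATION_MIX_UL

# 32 µM of each (7-deaza-)dNTP and 3.2 µM of one ddNTP in the termination mix.
dntp_um = final_concentration_um(32.0, TERMINATION_MIX_UL, total_ul)
ddntp_um = final_concentration_um(3.2, TERMINATION_MIX_UL, total_ul)
dntp_to_ddntp = dntp_um / ddntp_um

print(f"each dNTP: {dntp_um:.1f} µM, ddNTP: {ddntp_um:.2f} µM, ratio {dntp_to_ddntp:.0f}:1")
```

The dNTP to ddNTP ratio is unaffected by the dilution, so it stays at the 10:1 value of
the termination mix.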



Sequencing a 78-mer target

Sequencing complex:
5'-AAGATCTGACCAGGGATTCGGTTAGCGTGACTGCTGCTGCTGCTGCTGCTGC
TGGATGATCCGACGCATCAGATCTGG-(A^b)n-3' (SEQ. ID. NO. 15) (TNR.PLASM2)
3'-CTACTAGGCTGCGTAGTC-5' (CM1) (SEQ. ID. NO. 16)

The target TNR.PLASM2 was biotinylated and sequenced using
procedures similar to those described in the previous section (sequencing a 39-mer target).

Sequencing a 15-mer target with partially duplex probe

Sequencing complex:

5'-F-GATGATCCGACGCATCACAGCTC-3' (SEQ. ID. No. 17)
3'-b-CTACTAGGCTGCGTAGTGTCGAGAACCTTGGCT-5' (SEQ. ID. No. 18)



CM1B3B was immobilized on Dynabeads M280 with streptavidin (Dynal,
Norway) by incubating 60 pmol of CM1B3B with 0.3 mg magnetic beads in 30 µl 1 M NaCl
and TE (1x binding and washing buffer) at room temperature for 30 min. The beads
were washed twice with TE and redissolved in 30 µl TE; a 10 or 20 µl aliquot (containing
0.1 or 0.2 mg of beads, respectively) was used for sequencing reactions.

The duplex was formed by annealing the corresponding aliquot of beads from the
previous step with 10 pmol of DF11a5F (or 20 pmol of DF11a5F for 0.2 mg of beads) in
a 9 µl volume containing 2 µl of 5x Sequenase buffer (200 mM Tris-HCl, pH 7.5, 100
mM MgCl2, and 250 mM NaCl) from the Sequenase kit. The annealing mixture was
heated to 65 °C and allowed to cool slowly to 37 °C over a 20-30 min time period. The
duplex primer was then mixed with 10 pmol of TS10 (20 pmol of TS10 for 0.2 mg of
beads) in 1 µl volume, and the resulting mixture was further incubated at 37 °C for 5 min
and at room temperature for 5-10 min. Then 1 µl 0.1 M dithiothreitol solution, 1 µl Mn buffer
(0.15 M sodium isocitrate and 0.1 M MnCl2), and 2 µl of diluted Sequenase (3.25 units)
were added. The reaction mixture was divided into four aliquots of 3 µl each and mixed
with termination mixes (each consists of 4 µl of the appropriate termination mix: 16 µM
dATP, 16 µM dCTP, 16 µM dGTP, 16 µM dTTP and 1.6 µM of one of the four
ddNTPs, in 50 mM NaCl). The reaction mixtures were incubated at room temperature
for 5 min, and at 37 °C for 5 min. After the completion of extension, the beads were
precipitated and the supernatant was removed. The beads were resuspended in 20 µl TE
and kept at 4 °C. An aliquot of 2 µl (out of 20 µl) from each tube was taken and mixed
with 8 µl of formamide; the resulting samples were denatured at 90-95 °C for 5 min and 2
µl (out of 10 µl total) was applied to an ALF DNA sequencer (Pharmacia, Piscataway,
NJ) using a 10% polyacrylamide gel containing 7 M urea and 0.6x TBE. The remaining
aliquot was used for MALDI-TOFMS analysis.

MALDI sample preparation and instrumentation
Before MALDI analysis, the sequencing ladder loaded magnetic beads
were washed twice using 50 mM ammonium citrate and resuspended in 0.5 µl pure
water. The suspension was then loaded onto the sample target of the mass spectrometer
and 0.5 µl of saturated matrix solution (3-hydroxypicolinic acid (HPA):ammonium citrate
= 10:1 mole ratio in 50% acetonitrile) was added. The mixture was allowed to dry prior
to mass spectrometer analysis.

The reflectron TOFMS mass spectrometer (Vision 2000, Finnigan MAT,
Bremen, Germany) was used for analysis. 5 kV was applied in the ion source and 20 kV
was applied for postacceleration. All spectra were taken in the positive ion mode and a
nitrogen laser was used. Normally, each spectrum was averaged for more than 100 shots
and a standard 25-point smoothing was applied.
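
The shot averaging and smoothing described above can be sketched in a few lines of
Python. The text does not specify the smoothing algorithm, so the sketch assumes an
unweighted, centered moving average whose window shrinks at the ends of the spectrum;
the function names average_shots and moving_average are hypothetical, and the spectra
are plain lists of intensities on a common mass axis.

```python
def average_shots(shots):
    """Average several single-shot spectra (equal-length intensity lists) point by point."""
    n_shots = len(shots)
    return [sum(values) / n_shots for values in zip(*shots)]


def moving_average(signal, window=25):
    """Centered moving average; near the ends the window is truncated to the data."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


# Toy data: 100 identical single-shot spectra with one sharp peak at channel 50.
single_shot = [0.0] * 101
single_shot[50] = 25.0
averaged = average_shots([single_shot] * 100)
smoothed = moving_average(averaged, window=25)

print(max(averaged), max(smoothed))
```

In this toy case the sharp peak is spread over 25 channels, which shows how such a
filter trades peak height and width for lower noise.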

RESULTS AND DISCUSSIONS

Conventional solid-state sequencing 

In conventional sequencing methods, a primer is directly annealed to the
template and then extended and terminated in a Sanger dideoxy sequencing reaction.
Normally, a biotinylated primer is used and the sequencing ladders are captured by
streptavidin-coated magnetic beads. After washing, the products are eluted from the beads using
EDTA and formamide. However, our previous findings indicated that only the annealed
strand of a duplex is desorbed and the immobilized strand remains on the beads (Tang, K.
et al., (1995) Nucleic Acids Research 23:3126-3131). Therefore, it is advantageous to
immobilize the template and anneal the primer. After the sequencing reaction and
washing, the beads with the immobilized template and annealed sequencing ladder can be
loaded directly onto the mass spectrometer target and mixed with matrix. In MALDI, only
the annealed sequencing ladder will be desorbed and ionized, and the immobilized
template will remain on the target.

A 39-mer template (SEQ. ID. No. 13) was first biotinylated at the 3' end
by adding biotin-14-dATP with terminal transferase. More than one biotin-14-dATP
molecule could be added by the enzyme. However, since the template was immobilized
and remained on the beads during MALDI, the number of biotin-14-dATP molecules would not
affect the mass spectra. A 14-mer primer (SEQ. ID. No. 14) was used for the solid-state
sequencing. MALDI-TOF mass spectra of the four sequencing ladders are shown in
Figure 32, and the expected theoretical values are shown in Table III. The sequencing
reaction produced a relatively homogeneous ladder, and the full-length sequence was
determined easily. One peak around 5150 that appeared in all reactions was not identified. A
possible explanation is that a small portion of the template formed some kind of
secondary structure, such as a loop, which hindered Sequenase extension.
Misincorporation is of minor importance, since the intensities of these peaks were much lower
than those of the sequencing ladders. Although 7-deaza purines were used in the
sequencing reaction, which could stabilize the N-glycosidic bond and prevent
depurination, minor base losses were still observed since the primer was not substituted
by 7-deazapurines. The full length ladder, with a ddA at the 3' end, appeared in the A
reaction with an apparent mass of 11899.8. However, a more intense peak of 122
appeared in all four reactions and is likely due to an addition of an extra nucleotide by the
Sequenase enzyme. 
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The same technique could be used to sequence longer DNA fragments. A 
78-mer template containing a CTG repeat (SEQ. ID. No. 15) was 3*-biotinyiated by 
adding biotin-14-dATP with terminal transferase. An 18-mer primer (SEQ. ID. No. 16) 
was annealed right outside the CTG repeat so that the repeat could be sequenced 
immediately after primer extension. The four reactions were washed and analyzed by 
MALDI-TOFMS as usual. An example of the G-reaction is shown in Figure 33 and the 
expected sequencing ladder is shown in Table IV with theoretical mass values for each 
ladder component. All sequencing peaks were well resolved except the last component 
(theoretical value 20577.4) was indistinguishable from the background. Two neighboring 
sequencing peaks (a 62-mer and a 63-mer) were also separated indicating that such 
sequencing analysis could be applicable to longer templates. Again, an addition of an 
extra nucleotide by the Sequenase enzyme was observed in this spectrum. This addition 
is not template specific and appeared in all four reactions which makes it easy to be 
identified. Compared to the primer peak, the sequencing peaks were at much lower 
intensity in the long template case. Further optimization of the sequencing reaction may 
be required. 
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TABLE XV (Continued) 
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Se<wencm2 usinz duplex DNA probes for capturing and priminZ 
Duplex DNA probes with a single-stranded overhang have been demonstrated to
be able to capture specific DNA templates and also serve as primers for solid-state sequencing.
The scheme is shown in Figure 34. Stacking interactions between a duplex probe and a single-
stranded template allow an overhang of only 5 bases to be sufficient for capturing. Based on this
format, a 5' fluorescent-labeled 23-mer (5'-GAT GAT CCG ACG CAT CAC AGC TC) (SEQ
ID No. 19) was annealed to a 3'-biotinylated 18-mer (5'-GTG ATG CGT CGG ATC ATC)
(SEQ ID No. 20), leaving a 5-base overhang. A 15-mer template (5'-TCG GTT CCA AGA
GCT) (SEQ ID No. 21) was captured by the duplex and sequencing reactions were performed
by extension of the 5-base overhang. MALDI-TOF mass spectra of the reactions are shown in
Figure 35. All sequencing peaks were resolved, although at relatively low intensities. The last
peak in each reaction is due to unspecific addition of one nucleotide to the full length extension
product by the Sequenase enzyme. For comparison, the same products were run on a
conventional DNA sequencer and a stacking fluorogram of the results is shown in Figure 36.
As can be seen from the Figure, the mass spectra had the same pattern as the fluorogram, with
sequencing peaks at much lower intensity compared to the 23-mer primer.
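
The geometry of the capture duplex can be checked directly from the three oligonucleotides quoted above. The following is a minimal sketch, assuming plain Watson-Crick pairing and ignoring the labels (fluorescein, biotin); the function names are illustrative and not part of the specification. It confirms that the 18-mer pairs with the 5' end of the 23-mer, that the unpaired 3' tail of the 23-mer is five bases long, and it derives the bases a polymerase would add when copying the captured 15-mer.

```python
# Sketch only: function names are hypothetical; sequences are SEQ ID Nos. 19-21 as quoted in the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def overhang(primer, capture):
    """Unpaired 3' tail of `primer` once `capture` is annealed to its 5' end."""
    paired = reverse_complement(capture)
    if not primer.startswith(paired):
        raise ValueError("capture strand is not complementary to the primer 5' end")
    return primer[len(paired):]


def extension_product(primer, capture, template):
    """Bases added to the primer 3' end when the captured template is copied."""
    tail = overhang(primer, capture)
    copy = reverse_complement(template)
    if not copy.startswith(tail):
        raise ValueError("template 3' end does not pair with the overhang")
    return copy[len(tail):]


PRIMER_23 = "GATGATCCGACGCATCACAGCTC"   # SEQ ID No. 19
CAPTURE_18 = "GTGATGCGTCGGATCATC"       # SEQ ID No. 20
TEMPLATE_15 = "TCGGTTCCAAGAGCT"         # SEQ ID No. 21

tail = overhang(PRIMER_23, CAPTURE_18)
added = extension_product(PRIMER_23, CAPTURE_18, TEMPLATE_15)
print(tail, added)
```

Under these assumptions the overhang is AGCTC and the full-length extension adds ten bases, so each of the four termination reactions can contain only a few ladder peaks above the 23-mer primer peak.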

Improvements of MALDI-TOF mass spectrometry as a detection technique
Sample distribution can be made more homogeneous and signal intensity could
potentially be increased by implementing the picoliter vial technique. In practice, the samples
can be loaded on small pits with square openings of 100 µm size. The beads used in the solid-
state sequencing are less than 10 µm in diameter, so they should fit well in the picoliter vials.
Microcrystals of matrix and DNA containing "sweet spots" will be confined in the vial. Since
the laser spot size is about 100 µm in diameter, it will cover the entire opening of the vial.
Therefore, searching for sweet spots will be unnecessary and a high repetition-rate laser (e.g.
>10 Hz) can be used for acquiring spectra. An earlier report has shown that this device is
capable of increasing the detection sensitivity of peptides and proteins by several orders of
magnitude compared to the conventional MALDI sample preparation technique.

Resolution of MALDI on DNA needs to be further improved in order to extend the
sequencing range beyond 100 bases. Currently, using 3-HPA/ammonium citrate as matrix and a
reflectron TOF mass spectrometer with a 5 kV ion source and 20 kV postacceleration, the
resolution of the run-through peak in Figure 33 (73-mer) is greater than 200 (FWHM), which is
enough for sequence determination in this case. This resolution is also the highest reported for
MALDI desorbed DNA ions above the 70-mer range. Use of the delayed extraction technique
may further enhance resolution.
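
Why a resolution of about 200 suffices here can be seen with a rough calculation. The sketch below assumes that resolution is defined as m/Δm at FWHM, that two neighboring ladder peaks count as separated when they lie more than one FWHM apart, and that the smallest spacing between ladder peaks is one dCMP residue (an average mass of 289.18 Da, a standard value that is not quoted in this section). The mass 20577.4 is the theoretical value of the last ladder component mentioned earlier; all names are illustrative.

```python
# Sketch only: a one-FWHM separation criterion is an assumption, not the text's stated method.
def peak_width_fwhm(mass, resolution):
    """Peak width in Da (FWHM) for a peak at `mass` when resolution = m / delta_m."""
    return mass / resolution


def neighbours_resolved(mass, resolution, smallest_increment):
    """True if peaks one nucleotide apart are further apart than one FWHM."""
    return peak_width_fwhm(mass, resolution) < smallest_increment


MASS_LAST_COMPONENT = 20577.4   # theoretical value quoted in the text
RESOLUTION = 200                # lower bound quoted in the text
SMALLEST_INCREMENT = 289.18     # assumed: average mass of a dCMP residue, in Da

width = peak_width_fwhm(MASS_LAST_COMPONENT, RESOLUTION)
max_mass = RESOLUTION * SMALLEST_INCREMENT
print(round(width, 1), neighbours_resolved(MASS_LAST_COMPONENT, RESOLUTION, SMALLEST_INCREMENT))
print(round(max_mass))
```

With these assumptions the peak width near 20.6 kDa is about 103 Da, well below one nucleotide, while the same criterion would fail somewhere near 58 kDa, which is consistent with the statement that higher resolution is needed to go beyond 100 bases.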

All of the above-cited references and publications are hereby incorporated by
reference.

EQUIVALENTS

Those skilled in the art will recognize, or be able to ascertain using no more than 
routine experimentation, numerous equivalents to the specific procedures described herein. 
Such equivalents are considered to be within the scope of this invention and are covered by the 
following claims. 
SEQUENCE LISTING

(1) GENERAL INFORMATION:

    (i) APPLICANT: Koster, Hubert

    (ii) TITLE OF INVENTION: DNA SEQUENCING BY MASS SPECTROMETRY

    (iii) NUMBER OF SEQUENCES: 21

    (iv) CORRESPONDENCE ADDRESS:
        (A) ADDRESSEE: Patent Group
                       Foley, Hoag & Eliot LLP
        (B) STREET: One Post Office Square
        (C) CITY: Boston
        (D) STATE: MA
        (E) COUNTRY: USA
        (F) ZIP: 02109-2170

    (v) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: ASCII (text)

    (vi) CURRENT APPLICATION DATA:
        (A) APPLICATION NUMBER:
        (B) FILING DATE: 18-MAR-1997
        (C) CLASSIFICATION:

    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: 08/617,010
        (B) FILING DATE: 18-MAR-1996

    (viii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: 08/178,216
        (B) FILING DATE: 06-JAN-1994
        (C) CLASSIFICATION:

    (ix) ATTORNEY/AGENT INFORMATION:
        (A) NAME: Arnold, Beth E.
        (B) REGISTRATION NUMBER: 35,430
        (C) REFERENCE/DOCKET NUMBER: SQA-3.25.27

    (x) TELECOMMUNICATION INFORMATION:
        (A) TELEPHONE: (617) 832-1294
        (B) TELEFAX: (617) 832-7000



(2) INFORMATION FOR SEQ ID NO:1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 14 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid

    (iii) HYPOTHETICAL: YES

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CATGCCATGG CATG    14



(2) INFORMATION FOR SEQ ID NO:2:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 21 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid

    (iii) HYPOTHETICAL: YES

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

AAATTGTGCA CATCCTGCAG C    21

(2) INFORMATION FOR SEQ ID NO:3:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 50 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid

    (iii) HYPOTHETICAL: YES

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

TAACGGTCAT TACGGCCATT GACTGTAGGA CCTGCATTAC ATGACTAGCT



(2) INFORMATION FOR SEQ ID NO:4:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 17 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid

    (iii) HYPOTHETICAL: YES

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

TAAAACGACG GGCCAGXG

(2) INFORMATION FOR SEQ ID NO:5:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 17 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid

    (iii) HYPOTHETICAL: YES

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

XAAAACGACG GGCCAGXG    17

(2) INFORMATION FOR SEQ ID NO:6:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 18 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

GTCACCCTCG ACCTGCAG    18

(2) INFORMATION FOR SEQ ID NO:7:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 19 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TTGTAAAACG ACGGCCAGT    19
(2) INFORMATION FOR SEQ ID NO:8:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 18 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

CTTCCACCGC GATGTTGA

(2) INFORMATION FOR SEQ ID NO:9:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 17 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

CAGGAAACAG CTATGAC

(2) INFORMATION FOR SEQ ID NO:10:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 17 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

GTAAAACGAC GGCCAGT    17

(2) INFORMATION FOR SEQ ID NO:11:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 19 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
        (A) NAME/KEY: misc_feature
        (B) LOCATION: 1..19
        (D) OTHER INFORMATION: /note= "All lowercase letters
            represent RiboG"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

GTCACCCTCG ACCTGCAgC

(2) INFORMATION FOR SEQ ID NO:12:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 20 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (ix) FEATURE:
        (A) NAME/KEY: misc_feature
        (B) LOCATION: 1..20
        (D) OTHER INFORMATION: /note= "All lowercase letters
            represent RiboG"

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

GTTGTAAAAC GAGGGCCAgT    20

(2) INFORMATION FOR SEQ ID NO:13:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 39 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

TCTGGCCTGG TGCAGGGCCT ATTGTAGTTG TGACGTACA    39
(2) INFORMATION FOR SEQ ID NO:14:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 14 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

TCAACACTGC ATGT    14

(2) INFORMATION FOR SEQ ID NO:15:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 78 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

AAGATCTGAC CAGGGATTCG GTTAGCGTGA CTGCTGCTGC TGCTGCTGCT GCTGGATGAT    60
CCGACGCATC AGATCTGG    78

(2) INFORMATION FOR SEQ ID NO:16:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 18 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

CTACTAGGCT GCGTAGTC    18

(2) INFORMATION FOR SEQ ID NO:17:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 23 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

GATGATCCGA CGCATCACAG CTC    23

(2) INFORMATION FOR SEQ ID NO:18:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 33 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

CTACTAGGCT GCGTAGTGTC GAGAACCTTG GCT    33

(2) INFORMATION FOR SEQ ID NO:19:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 23 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

GATGATCCGA CGCATCACAG CTC    23

(2) INFORMATION FOR SEQ ID NO:20:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 18 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

GTGATGCGTC GGATCATC    18

(2) INFORMATION FOR SEQ ID NO:21:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 15 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

TCGGTTCCAA GAGCT    15
CLAIMS

1. A method for determining the sequence of a nucleic acid, comprising the
steps of:

a) generating at least two base-specifically terminated nucleic acid
fragments containing modified purine nucleotides that are relatively
resistant to fragmentation during mass spectrometry;

b) determining the molecular weight value of each base-specifically
terminated fragment by mass spectrometry, wherein the molecular weight
values of at least two base-specifically terminated fragments are
determined concurrently; and

c) determining the sequence of the nucleic acid by aligning the base-specifically
terminated nucleic acid fragments according to molecular weight.

2. The method according to claim 1 wherein the nucleic acid fragments are 
purified before the step of determining the molecular weight values by mass 
spectrometry. 



3. The method according to claim 2 wherein the nucleic acid fragments are 
purified, comprising the steps of: 

a) reversibly immobilizing the nucleic acid fragments on a solid 
support; and 

b) washing out all remaining reactants and by-products. 

4. The method according to claim 3, further comprising the step of removing the 
nucleic acid fragments from the solid support. 
5. The method of claim 1, wherein the fragments contain deazapurine moieties.

6. The method of claim 1, wherein the deazapurine moieties are selected from
the group consisting of: C7-deazaadenine, C7-deazaguanine, 7-deazainosine
triphosphate, C9-deazaadenine, C9-deazaguanine and C9-deazainosine
triphosphate.

7. The method of claim 1, wherein at least about 50% of the purine nucleotides
are modified within the nucleotide fragment.

8. A process of claim 1 wherein the mass spectrometer is selected from the
group consisting of: Matrix-Assisted Laser Desorption/Ionization Time-of-Flight
(MALDI-TOF), Electrospray (ES), Ion Cyclotron Resonance (ICR), and Fourier
Transform and combinations thereof.

9. The method according to claim 1, wherein more than one species of nucleic
acid are concurrently sequenced by multiplex mass spectrometric nucleic acid
sequencing employing nucleic acid primers, chain-elongating nucleotides, and
chain-terminating nucleotides, wherein one of the sets of base-specifically
terminated fragments is unmodified and the other sets of base-specifically
terminated nucleic acid fragments are mass modified, and each of the sets of
base-specifically terminated nucleic acid fragments has a sufficient mass
difference to be distinguished from the others by mass spectrometry.

10. The method according to claim 9, wherein at least one of the sets of mass-
modified base-specifically terminated fragments is modified with a mass-
modifying functionality at a heterocyclic base of at least one nucleotide.

11. The method according to claim 10, wherein the heterocyclic base-modified
nucleotide is selected from the group consisting of a cytosine nucleotide
modified at C-5, a thymine nucleotide modified at C-5, a thymine nucleotide
modified at the C-5 methyl group, a uracil nucleotide modified at C-5, an adenine
nucleotide modified at C-8, an adenine nucleotide modified at C-7, a c7-
deazaadenine modified at C-8, a c7-deazaadenine modified at C-7, a guanine
nucleotide modified at C-8, a guanine nucleotide modified at C-7, a c7-
deazaguanine modified at C-8, a c7-deazaguanine modified at C-7, a
hypoxanthine modified at C-8, a c7-deazahypoxanthine modified at C-7, and a c7-
deazahypoxanthine modified at C-8.

12. The method according to claim 9, wherein at least one of the sets of mass- 
modified base-specifically terminated nucleic acid fragments is modified with a 
mass-modifying functionality attached to one or more phosphate moieties of the 
internucleotidic linkages of the fragments. 

13. The method according to claim 9, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality attached to one or more sugar moieties of
nucleotides within the set of mass modified base-specifically terminated
fragments at at least one sugar position selected from the group consisting of a
C-2' position, an external C-3' position, and an external C-5' position.

14. The method according to claim 9, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality (M) attached to the sugar moiety of a 5'-terminal
nucleotide and wherein the mass-modifying function (M) is the linking
functionality (L).

15. The method according to claim 9, wherein a mass-modifying functionality
(M) is attached to a set of base-specifically terminated nucleic acid fragments
subsequent to generating the base-specifically terminated nucleic acid fragments
and prior to determining the molecular weight values for the nested fragments by
mass spectrometry.
16. The method according to claim 15, wherein the base-specifically terminated
nucleic acid fragments are generated using at least one reagent selected from the
group consisting of a nucleic acid primer, a chain-elongating nucleotide, a chain-
terminating nucleotide and a tag probe which has been modified with a precursor
of the mass-modifying functionality, M; and a subsequent step comprises
modifying the precursor of the mass-modifying functionality to generate the
mass-modifying functionality, M, prior to mass spectrometry analysis.

17. The method according to claim 9, wherein mass differentiation of the tag
probes is achieved by changing the nucleotide composition of at least one of the
tag probes and complementary tag sequence in the species of nucleic acid.

18. The method according to claim 9, wherein the tag probes are covalently 
bound to the corresponding complementary tag sequence prior to mass 
spectrometry analysis. 

19. The method according to claim 18, wherein binding between the tag 
probes and the corresponding complementary tag sequences is achieved 
photochemically via photoactivatable groups. 

20. A method of sequencing a nucleic acid, comprising the steps of: 

a) reversibly linking an oligonucleotide to a solid support;

b) generating at least two base-specifically terminated nucleic acid 
fragments containing nucleotides that are relatively resistant to 
fragmentation during mass spectrometry; 

c) determining the molecular weight value of each nested fragment
in each of the four sets of base-specifically terminated fragments of the nucleic
acid by matrix assisted laser desorption/ionization mass spectrometry,
wherein the molecular weight values of at least two base-specifically terminated
fragments are determined concurrently and wherein the nested fragments are
cleaved from the solid support by a laser during mass spectrometry; and

d) determining the nucleotide sequence by aligning the base-
specifically terminated fragments according to molecular weight.

21. The method according to claim 20 wherein the nucleic acid fragments are
purified before the step of determining the molecular weight values by mass
spectrometry.

22. The method according to claim 21 wherein the nucleic acid fragments are
purified, comprising the steps of:

a) reversibly immobilizing the nucleic acid fragments on a solid
support; and

b) washing out all remaining reactants and by-products.

23. The method according to claim 22, further comprising the step of
removing the nucleic acid fragments from the solid support.

24. The method of claim 20, wherein the fragments contain deazapurine
moieties.

25. The method of claim 20, wherein the deazapurine moieties are selected
from the group consisting of: C7-deazaadenine, C7-deazaguanine, 7-
deazainosine triphosphate, C9-deazaadenine, C9-deazaguanine and C9-
deazainosine triphosphate.

26. The method of claim 20, wherein at least about 50% of the purine
nucleotides are modified within the nucleotide fragment.
27. A process of claim 20 wherein the mass spectrometer is selected from the
group consisting of: Matrix-Assisted Laser Desorption/Ionization Time-of-Flight
(MALDI-TOF), Electrospray (ES), Ion Cyclotron Resonance (ICR), and Fourier
Transform and combinations thereof.

28. The method according to claim 20, wherein more than one species of nucleic
acid are concurrently sequenced by multiplex mass spectrometric nucleic acid
sequencing employing nucleic acid primers, chain-elongating nucleotides, and
chain-terminating nucleotides, wherein one of the sets of base-specifically
terminated fragments is unmodified and the other sets of base-specifically
terminated nucleic acid fragments are mass modified, and each of the sets of
base-specifically terminated nucleic acid fragments has a sufficient mass
difference to be distinguished from the others by mass
spectrometry.

29. The method according to claim 28, wherein at least one of the sets of mass-
modified base-specifically terminated fragments is modified with a mass-
modifying functionality (M) at a heterocyclic base of at least one nucleotide.

30. The method according to claim 29, wherein the heterocyclic base-modified
nucleotide is selected from the group consisting of a cytosine nucleotide
modified at C-5, a thymine nucleotide modified at C-5, a thymine nucleotide
modified at the C-5 methyl group, a uracil nucleotide modified at C-5, an adenine
nucleotide modified at C-8, an adenine nucleotide modified at C-7, a c7-
deazaadenine modified at C-8, a c7-deazaadenine modified at C-7, a guanine
nucleotide modified at C-8, a guanine nucleotide modified at C-7, a c7-
deazaguanine modified at C-8, a c7-deazaguanine modified at C-7, a
hypoxanthine modified at C-8, a c7-deazahypoxanthine modified at C-7, and a
c7-deazahypoxanthine modified at C-8.

31. The method according to claim 28, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality (M) attached to one or more phosphate moieties of
the internucleotidic linkages of the fragments.

32. The method according to claim 28, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality (M) attached to one or more sugar moieties of
nucleotides within the set of mass modified base-specifically terminated
fragments at at least one sugar position selected from the group consisting of a
C-2' position, an external C-3' position, and an external C-5' position.

33. The method according to claim 28, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality (M) attached to the sugar moiety of a 5'-terminal
nucleotide and wherein the mass-modifying function (M) is the linking
functionality (L).

34. The method according to claim 28, wherein a mass-modifying functionality
(M) is attached to a set of base-specifically terminated nucleic acid fragments
subsequent to generating the base-specifically terminated nucleic acid fragments
and prior to determining the molecular weight values for the nested fragments by
mass spectrometry.



35. The method according to claim 34, wherein the base-specifically
terminated nucleic acid fragments are generated using at least one reagent
selected from the group consisting of a nucleic acid primer, a chain-elongating
nucleotide, a chain-terminating nucleotide and a tag probe which has been
modified with a precursor of the mass-modifying functionality, M; and a
subsequent step comprises modifying the precursor of the mass-modifying
functionality, M, to generate the mass-modifying functionality, M, prior to mass
spectrometry analysis.
36. The method according to claim 28, wherein mass differentiation of the tag
probes is achieved by changing the nucleotide composition of at least one of the 
tag probes and complementary tag sequence in the species of nucleic acid. 

37. The method according to claim 28, wherein the tag probes are covalently 
bound to the corresponding complementary tag sequence prior to mass 
spectrometric analysis. 

38. The method according to claim 37, wherein binding between the tag probes 
and the corresponding complementary tag sequences is achieved 
photochemically via photoactivatable groups. 

39. A method of multiplex analysis of nucleic acid sequences, comprising the 
steps of: 

a) reversibly linking a nucleic acid primer to a solid support; 

b) generating at least two conditioned, base-specifically terminated nucleic acid
fragments containing modified purine nucleotides that are relatively resistant to
fragmentation during mass spectrometry;

c) determining the molecular weight value of each fragment by matrix assisted
laser desorption/ionization mass spectrometry wherein the molecular weight
values of at least two base-specifically terminated fragments are determined
concurrently and wherein the fragments are cleaved from the solid support by a
laser during mass spectrometry; and

d) determining the nucleotide sequence by aligning the fragments according to
molecular weight; wherein at least one reagent selected from a group consisting
of a nucleic acid primer, a chain-elongating nucleotide, and a chain-terminating
nucleotide which has been mass-modified; wherein each set of base-specifically
terminated fragments has a sufficient mass difference from the other sets of base-
specifically terminated fragments so as to be unique; and wherein the molecular
weight values of the nested fragments of two or more sets of unseparated base-
specifically terminated fragments are determined concurrently.

40. The method according to claim 39, wherein the reversible linkage is
a photocleavable bond.

41. The method according to claim 39 wherein the base-specifically terminated
fragments are cleaved from the solid support prior to mass spectrometry.
FIGURE 2

Figure 2A: Sanger DNA Sequencing Reaction Products with
ddT Termination of Hypothetical DNA (50-mer).

5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT

Fragment                                                   Molecular weight
5'-dT                                                        226.23
5'-dTAACGGT                                                 2104.45
5'-dTAACGGTCAT                                              3011.04
5'-dTAACGGTCATT                                             3315.24
5'-dTAACGGTCATTACGGCCAT                                     5771.82
5'-dTAACGGTCATTACGGCCATT                                    6076.02
5'-dTAACGGTCATTACGGCCATTGACT                                7311.82
5'-dTAACGGTCATTACGGCCATTGACTGT                              7945.22
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCT                      10112.63
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCAT                  11348.43
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATT                 11652.62
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACAT             12872.42
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACT         14108.22
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT     15344.02

Figure 2B: Idealized MALDI-TOF Mass Spectrum Showing the
"ion current vs. time/molecular weight".
Molecular weight increments used:

Nucleotides: 5'-dTp, 3'-ddT, dTp, dAp, dCp, dGp

225.22.^22
304.19618
313.20954
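
The ladder masses listed for Figure 2A can be reproduced by starting from the shortest fragment and adding one nucleotide increment per base. The sketch below is an illustration under stated assumptions: the dTp and dAp increments are rounded from the values that survive in the list above, the dCp and dGp increments (289.18 and 329.21) are standard average residue masses that are not legible in this listing, and the starting mass 226.23 is the value given for the first ddT-terminated fragment. Function and variable names are hypothetical.

```python
# Sketch only: dCp/dGp increments are assumed standard average residue masses.
INCREMENTS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Hypothetical 50-mer of Figures 2 to 6 (SEQ ID NO:3)
TEMPLATE = "TAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT"


def ladder_positions(seq, base):
    """1-based positions at which a chain terminated at `base` can end."""
    return [i + 1 for i, b in enumerate(seq) if b == base]


def ladder_masses(seq, base, first_mass):
    """Masses of the nested fragments ending in `base`.

    `first_mass` is the mass of the shortest fragment; every later fragment
    adds the increments of the bases that lie between two termination sites.
    """
    positions = ladder_positions(seq, base)
    mass = first_mass
    ladder = [(positions[0], round(mass, 2))]
    for prev, pos in zip(positions, positions[1:]):
        mass += sum(INCREMENTS[b] for b in seq[prev:pos])
        ladder.append((pos, round(mass, 2)))
    return ladder


t_ladder = ladder_masses(TEMPLATE, "T", first_mass=226.23)
for position, mass in t_ladder:
    print(position, mass)
```

With these inputs the sketch yields 14 fragments whose masses agree with the Figure 2A values to within a few hundredths of a mass unit, the small differences coming from rounding of the increments.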
Figure 3A: Sanger DNA Sequencing Reaction Products with
ddA Termination of Hypothetical DNA (50-mer).

5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT

Fragment                                                   Molecular weight
5'-dTA                                                       530.43
5'-dTAA                                                      843.64
5'-dTAACGGTCA                                               2697.83
5'-dTAACGGTCATTA                                            3619.43
5'-dTAACGGTCATTACGGCCA                                      5458.61
5'-dTAACGGTCATTACGGCCATTGA                                  6709.42
5'-dTAACGGTCATTACGGCCATTGACTGTA                             8249.42
5'-dTAACGGTCATTACGGCCATTGACTGTAGGA                          9221.05
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCA                   11035.22
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTA                11956.82
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACA              12559.21
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGA           13505.83
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTA        14412.42

Figure 3B: Idealized MALDI-TOF Mass Spectrum Showing the
"ion current vs. time/molecular weight".
Figure 4A: Sanger DNA Sequencing Reaction Products with
ddG Termination of Hypothetical DNA (50-mer).

5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT

Fragment                                                   Molecular weight
5'-dTAACG                                                   1471.05
5'-dTAACGG                                                  1800.25
5'-dTAACGGTCATTACG                                          4246.84
5'-dTAACGGTCATTACGG                                         4576.05
5'-dTAACGGTCATTACGGCCATTG                                   6405.23
5'-dTAACGGTCATTACGGCCATTGACTG                               7641.03
5'-dTAACGGTCATTACGGCCATTGACTGTAG                            8587.64
5'-dTAACGGTCATTACGGCCATTGACTGTAGG                           8916.85
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTG                     10441.84
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATG            13201.63
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAG       14750.64

Figure 4B: Idealized MALDI-TOF Mass Spectrum Showing the
"ion current vs. time/molecular weight".
Figure 5A: Sanger DNA Sequencing Reaction Products with
ddC Termination of Hypothetical DNA (50-mer).

5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT

Fragment                                                   Molecular weight
5'-dTAACGGTC                                                2393.63
5'-dTAACGGTCATTAC                                           3917.63
5'-dTAACGGTCATTACGGC                                        4865.23
5'-dTAACGGTCATTACGGCC                                       5154.42
5'-dTAACGGTCATTACGGCCATTGAC                                 7007.62
5'-dTAACGGTCATTACGGCCATTGACTGTAGGAC                         9519.25
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACC                        9808.43
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGC                    10731.02
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTAC               12255.02
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGAC          13804.02
5'-dTAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGC      15039.82

Figure 5B: Idealized MALDI-TOF Mass Spectrum Showing the
"ion current vs. time/molecular weight".
FIGURE 6

Figure 6: Series of Molecular Weights of the Hypothetical
DNA (50-mer) after Sanger Sequencing and
Termination with either ddTTP, ddATP, ddGTP or
ddCTP respectively.

FIGURE 7

FIGURE 8

FIGURE 9
DNA SEQUENCING BY MASS SPECTROMETRY

Related Applications

This application is a continuation-in-part of U.S. Application Serial Number
08/617,010, which is a continuation-in-part of U.S. Application Serial Number
08/178,216, which issued as U.S. Patent No. 5,547,835, and which itself is a
continuation-in-part of U.S. Application Serial Number 08/001,323 filed January 7, 1993,
which is now abandoned. The contents of all related applications are incorporated herein
by reference.

Background of the Invention 

Since the genetic information is represented by the sequence of the four 
DNA building blocks deoxyadenosine- (dpA), deoxyguanosine- (dpG), deoxycytidine- 
(dpC) and deoxythymidine-5'-phosphate (dpT), DNA sequencing is one of the most 
fundamental technologies in molecular biology and the life sciences in general. The ease 
and the rate by which DNA sequences can be obtained greatly affects related 
technologies such as development and production of new therapeutic agents and new and 
useful varieties of plants and microorganisms via recombinant DNA technology. In 
particular, unraveling the DNA sequence helps in understanding human pathological
conditions including genetic disorders, cancer and AIDS. In some cases, very subtle
differences such as a one nucleotide deletion, addition or substitution can create serious,
in some cases even fatal, consequences. Recently, DNA sequencing has become the core
technology of the Human Genome Sequencing Project (e.g., J.E. Bishop and M.
Waldholz, 1991, Genome: The Story of the Most Astonishing Scientific Adventure of
Our Time - The Attempt to Map All the Genes in the Human Body, Simon & Schuster,
New York). Knowledge of the complete human genome DNA sequence will certainly
help to understand, to diagnose, to prevent and to treat human diseases. To be able to
tackle successfully the determination of the approximately 3 billion base pairs of the
human genome in a reasonable time frame and in an economical way, rapid, reliable,



WO 97/37041 PCT/US97/04394 

sensitive and inexpensive methods need to be developed, which also offer the possibility 
of automation. The present invention provides such a technology. 

Recent reviews of today's methods together with future directions and 
trends are given by Barrell (The FASEB Journal 5, 40-45 (1991)), and Trainor (Anal. 
Chem. 62, 418-26 (1990)). 

Currently, DNA sequencing is performed by either the chemical degradation 
method of Maxam and Gilbert (Methods in Enzymology, 499-560 (1980)) or the 
enzymatic dideoxynucleotide termination method of Sanger et al. (Proc. Natl. Acad. Sci. 
USA 74, 5463-67 (1977)). In the chemical method, base specific modifications result in 
a base specific cleavage of the radioactively or fluorescently labeled DNA fragment. With 
the four separate base specific cleavage reactions, four sets of nested fragments are 
produced which are separated according to length by polyacrylamide gel electrophoresis 
(PAGE). After autoradiography, the sequence can be read directly since each band 
(fragment) in the gel originates from a base specific cleavage event. Thus, the fragment 
lengths in the four "ladders" directly translate into a specific position in the DNA 
sequence. 

In the enzymatic chain termination method, the four base specific sets of 
DNA fragments are formed by starting with a primer/template system, elongating the 
primer into the unknown DNA sequence area and thereby copying the template and 
synthesizing a complementary strand by DNA polymerases, such as Klenow fragment of 
E. coli DNA polymerase I, a DNA polymerase from Thermus aquaticus, Taq DNA 
polymerase, or a modified T7 DNA polymerase, Sequenase (Tabor et al., Proc. Natl. 
Acad. Sci. USA 84, 4767-4771 (1987)), in the presence of chain-terminating reagents. 
Here, the chain-terminating event is achieved by incorporating into the four separate 
reaction mixtures, in addition to the four normal deoxynucleoside triphosphates, dATP, 
dGTP, dTTP and dCTP, only one of the chain-terminating dideoxynucleoside 
triphosphates, ddATP, ddGTP, ddTTP or ddCTP, respectively, in a limiting small 
concentration. The four sets of resulting fragments produce, after electrophoresis, four 
base specific ladders from which the DNA sequence can be determined. 
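
The relationship between the four base specific ladders and the sequence can be sketched in a few lines of Python. This is an idealized illustration only: it assumes that every possible termination product is formed and detected, it ignores the primer, and the helper names `chain_termination_ladders` and `read_ladders` are hypothetical.

```python
# Illustrative sketch only: idealized chain termination, one reaction per base.
# The function names are hypothetical and not taken from the specification.

def chain_termination_ladders(synthesized_strand):
    """Return, for each base, the lengths of the fragments ending in that base."""
    ladders = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(synthesized_strand, start=1):
        # A dideoxy terminator for `base` can stop the chain at this position,
        # giving a fragment of length `position` in the matching reaction.
        ladders[base].append(position)
    return ladders


def read_ladders(ladders):
    """Read the sequence by visiting all fragments in order of increasing length."""
    tagged = [(length, base) for base, lengths in ladders.items() for length in lengths]
    return "".join(base for _, base in sorted(tagged))


strand = "TAACGGTCAT"
ladders = chain_termination_ladders(strand)
print(ladders["T"])           # lengths of the ddTTP-terminated fragments
print(read_ladders(ladders))  # sequence recovered from the four ladders
```

Reading the fragments in order of length mirrors reading the bands of a gel from the shortest fragment upward.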

A recent modification of the Sanger sequencing strategy involves the 
degradation of phosphorothioate-containing DNA fragments obtained by using alpha-thio 
dNTP instead of the normally used ddNTPs during the primer extension reaction 
mediated by DNA polymerase (Labeit et al., DNA 5, 173-177 (1986); Amersham, PCT 
Application GB86/00349; Eckstein et al., Nucleic Acids Res. 16, 9947 (1988)). Here, 
the four sets of base-specific sequencing ladders are obtained by limited digestion with 
exonuclease III or snake venom phosphodiesterase, subsequent separation on PAGE and 
visualization by radioisotopic labeling of either the primer or one of the dNTPs. In a 
further modification, the base-specific cleavage is achieved by alkylating the sulphur atom 
in the modified phosphodiester bond followed by a heat treatment (Max-Planck- 
Gesellschaft, DE 3930312 A1). Both methods can be combined with the amplification of 
the DNA via the Polymerase Chain Reaction (PCR). 

On the upfront end, the DNA to be sequenced has to be fragmented into 
sequenceable pieces of currently not more than 500 to 1000 nucleotides. Starting from a 
genome, this is a multi-step process involving cloning and subcloning steps using different 
and appropriate cloning vectors such as YAC, cosmids, plasmids and M13 vectors 
(Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory Press, 1989). Finally, for Sanger sequencing, the fragments of about 500 to 
1000 base pairs are integrated into a specific restriction site of the replicative form I (RF 
I) of a derivative of the M13 bacteriophage (Vieira and Messing, Gene 19, 259 (1982)) 
and then the double-stranded form is transformed to the single-stranded circular form to 
serve as a template for the Sanger sequencing process having a binding site for a 
universal primer obtained by chemical DNA synthesis (Sinha, Biernat, McManus and 
Köster, Nucleic Acids Res. 12, 4539-57 (1984); U.S. Patent No. 4,725,677) upstream of 
the restriction site into which the unknown DNA fragment has been inserted. Under 
specific conditions, unknown DNA sequences integrated into supercoiled double- 
stranded plasmid DNA can be sequenced directly by the Sanger method (Chen and 
Seeburg, DNA 4, 165-170 (1985) and Lim et al., Gene Anal. Techn. 5, 32-39 (1988)) 
and, with the Polymerase Chain Reaction (PCR) (PCR Protocols: A Guide to Methods 
and Applications, Innis et al., editors, Academic Press, San Diego (1990)), cloning or 
subcloning steps could be omitted by directly sequencing off chromosomal DNA by first 
amplifying the DNA segment by PCR and then applying the Sanger sequencing method 
(Innis et al., Proc. Natl. Acad. Sci. USA 85, 9436-9440 (1988)). In this case, however, 
the DNA sequence in the region of interest must be known at least to the extent to bind a 
sequencing primer. 

In order to be able to read the sequence from PAGE, detectable labels have 
to be used in either the primer (very often at the 5'-end) or in one of the deoxynucleoside 
triphosphates, dNTP. Using radioisotopes such as 32P, 33P, or 35S is still the most 
frequently used technique. After PAGE, the gels are exposed to X-ray films and silver 
grain exposure is analyzed. The use of radioisotopic labeling creates several problems. 
Most labels useful for autoradiographic detection of sequencing fragments have 
relatively short half-lives which can limit the useful time of the labels. The emission of high 
energy beta radiation, particularly from 32P, can lead to breakdown of the products via 
radiolysis so that the sample should be used very quickly after labeling. In addition, high 
energy radiation can also cause a deterioration of band sharpness by scattering. Some of 
these problems can be reduced by using the less energetic isotopes such as 33P or 35S 
(see, e.g., Ornstein et al., Biotechniques 3, 476 (1985)). Here, however, longer exposure 
times have to be tolerated. Above all, the use of radioisotopes poses significant health 
risks to the experimentalist and, in heavy sequencing projects, decontamination and 
handling the radioactive waste are other severe problems and burdens. 

In response to the above mentioned problems related to the use of 
radioactive labels, non-radioactive labeling techniques have been explored and, in recent 
years, integrated into partly automated DNA sequencing procedures. All these 
improvements utilize the Sanger sequencing strategy. The fluorescent label can be tagged 
to the primer (Smith et al., Nature 321, 674-679 (1986) and EPO Patent No. 
87300998.9; Du Pont De Nemours EPO Application No. 0359225; Ansorge et al., J. 
Biochem. Biophys. Methods 13, 325-32 (1986)) or to the chain-terminating 
dideoxynucleoside triphosphates (Prober et al., Science 238, 336-41 (1987); Applied 
Biosystems, PCT Application WO 91/05060). Based on either labeling the primer or the 
ddNTP, systems have been developed by Applied Biosystems (Smith et al., Science 235, 
G89 (1987); U.S. Patent Nos. 570973 and 689013), Du Pont De Nemours (Prober et al., 
Science 238, 336-341 (1987); U.S. Patents Nos. 881372 and 57566), Pharmacia-LKB 
(Ansorge et al., Nucleic Acids Res. 15, 4593-4602 (1987) and EMBL Patent Application 
DE P3724442 and P3805808.1) and Hitachi (JP 1-90844 and DE 4011991 A1). A 
somewhat similar approach was developed by Brumbaugh et al. (Proc. Natl. Acad. Sci. USA 
85, 5610-14 (1988) and U.S. Patent No. 4,729,947). An improved method for the Du 
Pont system using two electrophoretic lanes with two different specific labels per lane is 
described (PCT Application WO92/02635). A different approach uses fluorescently 
labeled avidin and biotin labeled primers. Here, the sequencing ladders ending with biotin 
are reacted during electrophoresis with the labeled avidin which results in the detection of 
the individual sequencing bands (Brumbaugh et al., U.S. Patent No. 594676). 

More recently even more sensitive non-radioactive labeling techniques for 
DNA using chemiluminescence triggerable and amplifiable by enzymes have been 
developed (Beck, O'Keefe, Coull and Köster, Nucleic Acids Res. 17, 5115-5123 (1989) 
and Beck and Köster, Anal. Chem. 62, 2258-2270 (1990)). These labeling methods were 
combined with multiplex DNA sequencing (Church et al., Science 240, 185-188 (1988)) to 
provide for a strategy aimed at high throughput DNA sequencing (Köster et al., 
Nucleic Acids Res. Symposium Ser. No. 24, 318-321 (1991); University of Utah, PCT 
Application No. WO 90/15883); this strategy still suffers from the disadvantage of being 
very laborious and difficult to automate. 

In an attempt to simplify DNA sequencing, solid supports have been 
introduced. In most cases published so far, the template strand for sequencing (with or 
without PCR amplification) is immobilized on a solid support most frequently utilizing 
the strong biotin-avidin/streptavidin interaction (Orion-Yhtyma Oy, U.S. Patent No. 
277643; M. Uhlen et al., Nucleic Acids Res. 16, 3025-38 (1988); Cemu Bioteknik, PCT 
Application No. WO 89/09282 and Medical Research Council, GB, PCT Application No. 
WO 92/03575). The primer extension products synthesized on the immobilized template 
strand are purified of enzymes, other sequencing reagents and by-products by a washing 
step and then released under denaturing conditions by breaking the hydrogen bonds 
between the Watson-Crick base pairs and subjected to PAGE separation. In a different 
approach, the primer extension products (not the template) from a DNA sequencing 
reaction are bound to a solid support via biotin/avidin (Du Pont De Nemours, PCT 
Application WO 91/11533). In contrast to the above mentioned methods, here, the 
interaction between biotin and avidin is overcome by employing denaturing conditions 
(formamide/EDTA) to release the primer extension products of the sequencing reaction 
from the solid support for PAGE separation. As solid supports, beads (e.g., magnetic 
beads (Dynabeads) and Sepharose beads), filters, capillaries, plastic dipsticks (e.g., 
polystyrene strips) and microtiter wells are being proposed. 

All methods discussed so far have one central step in common: 
polyacrylamide gel electrophoresis (PAGE). In many instances, this represents a major 
drawback and limitation for each of these methods. Preparing a homogeneous gel by 
polymerization, loading of the samples, the electrophoresis itself, detection of the 
sequence pattern (e.g., by autoradiography), removing the gel and cleaning the glass 
plates to prepare another gel are very laborious and time-consuming procedures. 
Moreover, the whole process is error-prone, difficult to automate, and, in order to 
improve reproducibility and reliability, highly trained and skilled personnel are required. 
In the case of radioactive labeling, autoradiography itself can consume from hours to 
days. In the case of fluorescent labeling, at least the detection of the sequencing bands is 
being performed automatically when using the laser-scanning devices integrated into 
commercially available DNA sequencers. One problem related to the fluorescent labeling is 
the influence of the four different base-specific fluorescent tags on the mobility of the 
fragments during electrophoresis and a possible overlap in the spectral bandwidth of the 
four specific dyes reducing the discriminating power between neighboring bands, hence, 
increasing the probability of sequence ambiguities. Artifacts are also produced by base- 
specific interactions with the polyacrylamide gel matrix (Frank and Köster, Nucleic 
Acids Res. 6, 2069 (1979)) and by the formation of secondary structures which result in 
"band compressions" and hence do not allow one to read the sequence. This problem 
has, in part, been overcome by using 7-deazadeoxyguanosine triphosphates (Barr et al., 
Biotechniques 4, 428 (1986)). However, the reasons for some artifacts and conspicuous 
bands are still under investigation and need further improvement of the gel 
electrophoretic procedure. 

A recent innovation in electrophoresis is capillary zone electrophoresis 
(CZE) (Jorgenson et al., J. Chromatography 352, 337 (1986); Gesteland et al., 
Nucleic Acids Res. 18, 1415-1419 (1990)) which, compared to slab gel electrophoresis 
(PAGE), significantly increases the resolution of the separation, reduces the time for an 
electrophoretic run and allows the analysis of very small samples. Here, however, other 
problems arise due to the miniaturization of the whole system such as wall effects and the 
necessity of highly sensitive on-line detection methods. Compared to PAGE, another 
drawback is created by the fact that CZE is only a "one-lane" process, whereas in PAGE 
samples in multiple lanes can be electrophoresed simultaneously. 

Due to the severe limitations and problems related to having PAGE as an 
integral and central part in the standard DNA sequencing protocol, several methods have 
been proposed to do DNA sequencing without an electrophoretic step. One approach 
calls for hybridization or fragmentation sequencing (Bains, Biotechnology 10, 757-58 
(1992) and Mirzabekov et al., FEBS Letters 256, 118-122 (1989)) utilizing the specific 
hybridization of known short oligonucleotides (e.g., octadeoxynucleotides which give 
65,536 different sequences) to a complementary DNA sequence. Positive hybridization 
reveals a short stretch of the unknown sequence. Repeating this process by performing 
hybridizations with all possible octadeoxynucleotides should theoretically determine the 
sequence. In a completely different approach, rapid sequencing of DNA is done by 
unilaterally degrading one single, immobilized DNA fragment by an exonuclease in a 
moving flow stream and detecting the cleaved nucleotides by their specific fluorescent tag 
via laser excitation (Jett et al., J. Biomolecular Structure & Dynamics 7, 301-309 
(1989); United States Department of Energy, PCT Application No. WO 89/03432). In 
another system proposed by Hyman (Anal. Biochem. 174, 423-436 (1988)), the 
pyrophosphate generated when the correct nucleotide is attached to the growing chain on 
a primer-template system is used to determine the DNA sequence. The enzymes used 
and the DNA are held in place by solid phases (DEAE-Sepharose and Sepharose) either 
by ionic interactions or by covalent attachment. In a continuous flow-through system, the 
amount of pyrophosphate is determined via bioluminescence (luciferase). A synthesis 
approach to DNA sequencing is also used by Tsien et al. (PCT Application No. WO 
91/06678). Here, the incoming dNTP's are protected at the 3'-end by various blocking 
groups such as acetyl or phosphate groups, which are removed before the next elongation 
step, making this process very slow compared to standard sequencing methods. 
The template DNA is immobilized on a polymer support. To detect incorporation, a 
fluorescent or radioactive label is additionally incorporated into the modified dNTP's. 
The same patent application also describes an apparatus designed to automate the 
process. 
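
The figure of 65,536 octadeoxynucleotides quoted above for the hybridization approach is simply 4 to the power 8. The following Python sketch enumerates the probe set and lists which octamers an idealized, error-free hybridization experiment would report for a short target. For simplicity the probes are written in the sense of the strand being read rather than as its complement, and the function names are hypothetical.

```python
from itertools import product

# Illustrative sketch only: perfect-match hybridization with no errors is assumed.

def all_oligos(k):
    """Enumerate every DNA oligonucleotide of length k."""
    return ["".join(bases) for bases in product("ACGT", repeat=k)]


def hybridization_spectrum(target, k):
    """Return the set of k-mers that occur in the target sequence."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


print(len(all_oligos(8)))  # size of the complete octamer probe set

# Each positive probe reveals one overlapping 8-nucleotide stretch of the target.
for oligo in sorted(hybridization_spectrum("TAACGGTCATTACGG", 8)):
    print(oligo)
```

Neighbouring octamers overlap by seven nucleotides, which is what in principle allows the stretches to be chained into a longer sequence.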

Mass spectrometry, in general, provides a means of "weighing" individual 
molecules by ionizing the molecules in vacuo and making them "fly" by volatilization. 
Under the influence of combinations of electric and magnetic fields, the ions follow 
trajectories depending on their individual mass (m) and charge (z). In the range of 
molecules with low molecular weight, mass spectrometry has long been part of the 
routine physical-organic repertoire for analysis and characterization of organic molecules 
by the determination of the mass of the parent molecular ion. In addition, by arranging 
collisions of this parent molecular ion with other particles (e.g., argon atoms), the 
molecular ion is fragmented forming secondary ions by the so-called collision induced 
dissociation (CID). The fragmentation pattern/pathway very often allows the derivation 
of detailed structural information. Many applications of mass spectrometric methods are 
known in the art, particularly in biosciences, and can be found summarized in 
Methods in Enzymology, Vol. 193: "Mass Spectrometry" (J. A. McCloskey, editor), 
1990, Academic Press, New York. 

Due to the apparent analytical advantages of mass spectrometry in 
providing high detection sensitivity, accuracy of mass measurements, detailed structural 
information by CID in conjunction with an MS/MS configuration and speed, as well as 
on-line data transfer to a computer, there has been considerable interest in the use of 
mass spectrometry for the structural analysis of nucleic acids. Recent reviews 
summarizing this field include K. H. Schram, "Mass Spectrometry of Nucleic Acid 
Components," Biomedical Applications of Mass Spectrometry 34, 203-287 (1990); and 
P.F. Crain, "Mass Spectrometric Techniques in Nucleic Acid Research," Mass 
Spectrometry Reviews 9, 505-554 (1990). The biggest hurdle to applying mass 
spectrometry to nucleic acids is the difficulty of volatilizing these very polar biopolymers. 
Therefore, "sequencing" has been limited to low molecular weight synthetic 
oligonucleotides by determining the mass of the parent molecular ion and through this, 
confirming the already known sequence, or alternatively, confirming the known sequence 
through the generation of secondary ions (fragment ions) via CID in an MS/MS 
configuration utilizing, in particular, for the ionization and volatilization, the method of 
fast atom bombardment (FAB mass spectrometry) or plasma desorption (PD mass 
spectrometry). As an example, the application of FAB to the analysis of protected 
dimeric blocks for chemical synthesis of oligodeoxynucleotides has been described 
(Köster et al., Biomedical Environmental Mass Spectrometry 14, 111-116 (1987)). 

Two more recent ionization/desorption techniques are electrospray/ionspray 
(ES) and matrix-assisted laser desorption/ionization (MALDI). ES mass spectrometry 
has been introduced by Fenn et al. (J. Phys. Chem. 88, 4451-59 (1984); PCT Application 
No. WO 90/14148) and current applications are summarized in recent review articles 
(R.D. Smith et al., Anal. Chem. 62, 882-89 (1990) and B. Ardrey, Electrospray Mass 
Spectrometry, Spectroscopy Europe 4, 10-18 (1992)). The molecular weights of the 
tetradecanucleotide d(CATGCCATGGCATG) (SEQ ID NO:1) (Covey et al., "The 
Determination of Protein, Oligonucleotide and Peptide Molecular Weights by Ionspray 
Mass Spectrometry," Rapid Communications in Mass Spectrometry 2, 249-256 (1988)), 
of the 21-mer d(AAATTGTGCACATCCTGCAGC) (SEQ ID NO:2) and, without giving 
details, that of a tRNA with 76 nucleotides (Methods in Enzymology 193, "Mass 
Spectrometry" (McCloskey, editor), p. 425, 1990, Academic Press, New York) have 
been published. As a mass analyzer, a quadrupole is most frequently used. The 
determination of molecular weights in femtomole amounts of sample is very accurate due 
to the presence of multiple ion peaks which all could be used for the mass calculation. 
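
How several ion peaks of one molecule can each contribute to the mass calculation may be illustrated with a short Python sketch. It rests on assumptions not stated in the text above: negative-ion mode, ions of the form [M - zH] carrying z negative charges, and noise-free peaks. The neutral mass of 4260.0 u and all function names are hypothetical.

```python
# Sketch under stated assumptions: negative-ion mode, ions of the form [M - zH]^z-.
PROTON_MASS = 1.00728  # approximate mass of a proton in u


def mz_series(neutral_mass, charges):
    """Idealized m/z values of a molecule carrying each charge in `charges`."""
    return [(neutral_mass - z * PROTON_MASS) / z for z in charges]


def mass_from_adjacent_peaks(mz_low, mz_high):
    """Estimate (z, M) from two neighbouring peaks.

    `mz_high` is assumed to carry charge z and `mz_low` charge z + 1.
    """
    z = round((mz_low + PROTON_MASS) / (mz_high - mz_low))
    return z, z * (mz_high + PROTON_MASS)


def averaged_mass(peaks):
    """Average the mass estimates from every pair of neighbouring peaks."""
    ordered = sorted(peaks)
    estimates = [mass_from_adjacent_peaks(low, high)[1]
                 for low, high in zip(ordered, ordered[1:])]
    return sum(estimates) / len(estimates)


# Hypothetical neutral mass of 4260.0 u observed at charges 3, 4 and 5.
peaks = mz_series(4260.0, [3, 4, 5])
print(mass_from_adjacent_peaks(peaks[1], peaks[0]))
print(averaged_mass(peaks))
```

With real spectra each pair of peaks gives a slightly different estimate, and averaging over the pairs is one simple way of using all of them.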

MALDI mass spectrometry, in contrast, can be particularly attractive when 
a time-of-flight (TOF) configuration is used as a mass analyzer. The MALDI-TOF mass 
spectrometry has been introduced by Hillenkamp et al. ("Matrix Assisted UV-Laser 
Desorption/Ionization: A New Approach to Mass Spectrometry of Large Biomolecules," 
Biological Mass Spectrometry (Burlingame and McCloskey, editors), Elsevier Science 
Publishers, Amsterdam, pp. 49-60, 1990). Since, in most cases, no multiple molecular 
ion peaks are produced with this technique, the mass spectra, in principle, look simpler 
compared to ES mass spectrometry. Although DNA molecules up to a molecular weight 
of 410,000 daltons could be desorbed and volatilized (Williams et al., "Volatilization of 
High Molecular Weight DNA by Pulsed Laser Ablation of Frozen Aqueous Solutions," 
Science 246, 1585-87 (1989)), this technique has so far only been used to determine the 
molecular weights of relatively small oligonucleotides of known sequence, e.g., 
oligothymidylic acids up to 18 nucleotides (Huth-Fehre et al., "Matrix-Assisted Laser 
Desorption Mass Spectrometry of Oligodeoxythymidylic Acids," 
Rapid Communications in Mass Spectrometry 6, 209-13 (1992)) and a double-stranded 
DNA of 28 base pairs (Williams et al., "Time-of-Flight Mass Spectrometry of Nucleic 
Acids by Laser Ablation and Ionization from a Frozen Aqueous Matrix," Rapid 
Communications in Mass Spectrometry 4, 348-351 (1990)). In one publication (Huth- 
Fehre et al., 1992, supra), it was shown that a mixture of all the oligothymidylic acids 
from n=12 to n=18 nucleotides could be resolved. 

In U.S. Patent No. 5,064,754, RNA transcripts extended by DNA, both of 
which are complementary to the DNA to be sequenced, are prepared by incorporating 
NTP's, dNTP's and, as terminating nucleotides, ddNTP's which are substituted at the 5'- 
position of the sugar moiety with one or a combination of the isotopes 12C, 13C, 14C, 
1H, 2H, 3H, 16O, 17O and 18O. The polynucleotides obtained are degraded to 3'- 
nucleotides, cleaved at the N-glycosidic linkage and the isotopically labeled 5'- 
functionality removed by periodate oxidation and the resulting formaldehyde species 
determined by mass spectrometry. A specific combination of isotopes serves to 
discriminate base-specifically between internal nucleotides originating from the 
incorporation of NTP's and dNTP's and terminal nucleotides caused by linking ddNTP's 
to the end of the polynucleotide chain. A series of RNA/DNA fragments is produced, 
and in one embodiment, separated by electrophoresis, and, with the aid of the so-called 
matrix method of analysis, the sequence is deduced. 

In Japanese Patent No. 59-131909, an instrument is described which detects 
nucleic acid fragments separated either by electrophoresis, liquid chromatography or high 
speed gel filtration. Mass spectrometric detection is achieved by incorporating into the 
nucleic acids atoms which normally do not occur in DNA such as S, Br, I or Ag, Au, Pt, 
Os, Hg. The method, however, is not applied to sequencing of DNA using the Sanger 
method. In particular, it does not propose a base-specific correlation of such elements to 
an individual ddNTP. 

PCT Application No. WO 89/12694 (Brennan et al., Proc. SPIE-Int. Soc. 
Opt. Eng. 1206 (New Technol. Cytom. Mol. Biol.), pp. 60-77 (1990); and Brennan, 
U.S. Patent No. 5,003,059) employs the Sanger methodology for DNA sequencing by 
using a combination of either the four stable isotopes 32S, 33S, 34S, 36S or 35Cl, 37Cl, 
79Br, 81Br to specifically label the chain-terminating ddNTP's. The sulfur isotopes can 
be located either in the base or at the alpha-position of the triphosphate moiety whereas 
the halogen isotopes are located either at the base or at the 3'-position of the sugar ring. 
The sequencing reaction mixtures are separated by an electrophoretic technique such as 
CZE, transferred to a combustion unit in which the sulfur isotopes of the incorporated 
ddNTP's are transformed at about 900°C in an oxygen atmosphere. The SO2 generated 
with masses of 64, 65, 66 or 68 is determined on-line by mass spectrometry using, e.g., as 
mass analyzer, a quadrupole with a single ion-multiplier to detect the ion current. 
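
The SO2 masses of 64, 65, 66 and 68 quoted above follow from adding two oxygen-16 atoms to each of the four stable sulfur isotopes. A minimal Python check is given below; the pairing of a particular isotope with a particular terminator is a hypothetical assignment chosen only for illustration and is not taken from the cited application.

```python
# Nominal (integer) masses only; oxygen is assumed to be 16O throughout.
OXYGEN_16 = 16


def so2_nominal_mass(sulfur_mass_number):
    """Nominal mass of SO2 formed from a sulfur isotope and two 16O atoms."""
    return sulfur_mass_number + 2 * OXYGEN_16


# Hypothetical assignment of one sulfur isotope per chain terminator.
terminator_isotope = {"ddATP": 32, "ddGTP": 33, "ddCTP": 34, "ddTTP": 36}

# Map each SO2 mass channel back to the terminator it would indicate.
so2_channel = {so2_nominal_mass(isotope): terminator
               for terminator, isotope in terminator_isotope.items()}

for mass in sorted(so2_channel):
    print(mass, so2_channel[mass])
```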

A similar approach is proposed in U.S. Patent No. 5,002,868 (Jacobson et 
al., Proc. SPIE-Int. Soc. Opt. Eng. 1435 (Opt. Methods Ultrasensitive Detect. Anal. 
Tech. Appl.), 26-35 (1991)) using Sanger sequencing with four ddNTP's specifically 
substituted at the alpha-position of the triphosphate moiety with one of the four stable 
sulfur isotopes as described above and subsequent separation of the four sets of nested 
sequences by tube gel electrophoresis. The only difference is the use of resonance 
ionization spectroscopy (RIS) in conjunction with a magnetic sector mass analyzer as 
disclosed in U.S. Patent No. 4,442,354 to detect the sulfur isotopes corresponding to the 
specific nucleotide terminators, and by this, allowing the assignment of the DNA 
sequence. 

EPO Patent Applications No. 0360676 A1 and 0360677 A1 also describe 
Sanger sequencing using stable isotope substitutions in the ddNTP's such as D, 13C, 
15N, 17O, 18O, 32S, 33S, 34S, 36S, 19F, 35Cl, 37Cl, 79Br, 81Br and 127I or functional 
groups such as CF3 or Si(CH3)3 at the base, the sugar or the alpha position of the 
triphosphate moiety according to chemical functionality. The Sanger sequencing reaction 
mixtures are separated by tube gel electrophoresis. The effluent is converted into an 
aerosol by the electrospray/thermospray nebulizer method and then atomized and ionized 
by a hot plasma (7000 to 8000°K) and analyzed by a simple mass analyzer. An 
instrument is proposed which enables one to automate the analysis of the Sanger 
sequencing reaction mixture consisting of tube electrophoresis, a nebulizer and a mass 
analyzer. 

The application of mass spectrometry to perform DNA sequencing by the 
hybridization/fragment method (see above) has been recently suggested (Bains, "DNA 
Sequencing by Mass Spectrometry: Outline of a Potential Future Application," 
Chimicaoggi 9, 13-16 (1991)). 

Summary of the Invention 

The invention describes a new method to sequence DNA. The 
improvements over the existing DNA sequencing technologies include high speed, high 
throughput, no required electrophoresis (and, thus, no gel reading artifacts due to the 
complete absence of an electrophoretic step), and no costly reagents involving various 
substitutions with stable isotopes. The invention utilizes the Sanger sequencing strategy 
and assembles the sequence information by analysis of the nested fragments obtained by 
base-specific chain termination via their different molecular masses using mass 
spectrometry, for example, MALDI or ES mass spectrometry. A further increase in 
throughput can be obtained by introducing mass modifications in the oligonucleotide 
primer, the chain-terminating nucleoside triphosphates and/or the chain-elongating 
nucleoside triphosphates, as well as using integrated tag sequences which allow 
multiplexing by hybridization of tag specific probes with mass differentiated molecular 
weights. 
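
The idea of assembling the sequence from the molecular masses of the nested fragments can be sketched as follows. This is an idealized illustration and not the procedure of the invention: the residue masses are approximate average values assumed for the purpose, the primer, end groups, counter-ions and the missing 3'-hydroxyl of a dideoxy terminator are ignored, and the 50-mer of SEQ ID NO:3 is treated as the synthesized strand. The function names are hypothetical.

```python
# Approximate average masses (u) of nucleotide residues within a DNA chain.
# Assumed values for illustration; end groups and terminator details are ignored.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def fragment_masses(sequence, terminator_base):
    """Masses of the nested fragments that end in `terminator_base`."""
    masses, running = [], 0.0
    for base in sequence:
        running += RESIDUE_MASS[base]
        if base == terminator_base:
            masses.append(round(running, 2))
    return masses


def sequence_from_spectra(spectra):
    """Merge the four base-specific mass lists and read bases in order of mass."""
    peaks = sorted((mass, base) for base, masses in spectra.items() for mass in masses)
    return "".join(base for _, base in peaks)


MODEL = "TAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT"
spectra = {base: fragment_masses(MODEL, base) for base in "ACGT"}

print(len(spectra["T"]), "T-terminated fragments in this simplified model")
print(sequence_from_spectra(spectra) == MODEL)
```

Because every added nucleotide increases the mass, ordering the peaks of the four base-specific spectra by mass plays the role that ordering bands by length plays on a gel.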

Brief Description of the Figures 

FIGURE 1 is a representation of a process to generate the samples to be 
analyzed by mass spectrometry. This process entails insertion of a DNA fragment of 
unknown sequence into a cloning vector such as derivatives of M13, pUC or phagemids; 
transforming the double-stranded form into the single-stranded form; performing the four 
Sanger sequencing reactions; linking the base-specifically terminated nested fragment 
family temporarily to a solid support; removing by a washing step all by-products; 
conditioning the nested DNA or RNA fragments by, for example, cation-ion exchange or 
modification reagent and presenting the immobilized nested fragments either directly to 
mass spectrometric analysis or cleaving the purified fragment family off the support and 
evaporating the cleavage reagent. 

FIGURE 2A shows the Sanger sequencing products using ddTTP as 
terminating deoxynucleoside triphosphate of a hypothetical DNA fragment of 50 
nucleotides (SEQ ID NO:3) in length with approximately equally balanced base 
composition. The molecular masses of the various chain terminated fragments are given. 

FIGURE 2B shows an idealized mass spectrum of such a DNA fragment 
mixture. 

FIGURES 3A and 3B show, in analogy to FIGURES 2A and 2B, data for 
the same model sequence (SEQ ID NO:3) with ddATP as chain terminator. 

FIGURES 4A and 4B show data, analogous to FIGURES 2A and 2B, 
when ddGTP is used as a chain terminator for the same model sequence (SEQ ID NO:3). 

FIGURES 5A and 5B illustrate the results obtained where chain 
termination is performed with ddCTP as a chain terminator, in a similar way as shown in 
FIGURES 2A and 2B for the same model sequence (SEQ ID NO:3). 

FIGURE 6 summarizes the results of FIGURES 2A to 5B, showing the 
correlation of molecular weights of the nested four fragment families to the DNA 
sequence (SEQ ID NO:3). 

FIGURE 7 illustrates the general structure of mass-modified sequencing 
nucleic acid primers or tag sequencing probes for either Sanger DNA or Sanger RNA 
sequencing. 

FIGURE 8 shows the general structure for the mass-modified 
triphosphates for either Sanger DNA or Sanger RNA sequencing. General formulas of 
the chain-elongating and the chain-terminating nucleoside triphosphates are 
demonstrated. 

FIGURE 9 outlines various linking chemistries (X) with either 
polyethylene glycol or terminally monoalkylated polyethylene glycol (R) as an example. 

FIGURE 10 illustrates similar linking chemistries as shown in FIGURE 8 
and depicts various mass modifying moieties (R). 

FIGURE 11 outlines how multiplex mass spectrometric sequencing can 
work using the mass-modified nucleic acid primer (UP). 

FIGURE 12 shows the process of multiplex mass spectrometric 
sequencing employing mass-modified chain-elongating and/or terminating nucleoside 
triphosphates. 

FIGURE 13 shows multiplex mass spectrometric sequencing by involving 
the hybridization of mass-modified tag sequence specific probes. 

FIGURE 14 shows a MALDI-TOF spectrum of a mixture of 
oligothymidylic acids, d(pT)12-18. 

FIGURE 15 shows a superposition of MALDI-TOF spectra of the 50-mer 
d(TAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT) 
(SEQ ID NO:3) (500 fmol) and dT(pdT)99 (500 fmol). 

FIGURE 16 shows the MALDI-TOF spectra of all 13 DNA sequences 
representing the nested dT-terminated fragments of the Sanger DNA sequencing 
simulation of Figure 2, 500 fmol each. 

FIGURE 17 shows the superposition of the spectra of FIGURE 16. The 
two panels show two different scales and the spectra analyzed at that scale. 

FIGURE 18 shows the superimposed MALDI-TOF spectra from MALDI- 
MS analysis of mass-modified oligonucleotides as described in Example 21. 

FIGURE 19 illustrates various linking chemistries between the solid 
support (P) and the nucleic acid primer (NA) through a strong electrostatic interaction. 

FIGURE 20 illustrates various linking chemistries between the solid 
support (P) and the nucleic acid primer (NA) through a charge transfer complex of a 
charge transfer acceptor (A) and a charge transfer donor (D). 

FIGURE 21 illustrates various linking chemistries between the solid 
support (P) and the nucleic acid primer (NA) through a stable organic radical. 

FIGURE 22 illustrates a possible linking chemistry between the solid 
support (P) and the nucleic acid primer (NA) through Watson-Crick base pairing. 

FIGURE 23 illustrates linking the solid support (P) and the nucleic acid 
primer (NA) through a photolytically cleavable bond. 

FIGURE 24 shows the portion of the sequence of pRFc1 DNA, which 
was used as template for PCR amplification of unmodified and 7-deazapurine containing 
99-mer and 200-mer nucleic acids as well as the sequences of the 19-mer primers and the 
two 18-mer reverse primers. 

FIGURE 25 shows the portion of the nucleotide sequence of M13mp18 
RFI DNA, which was used for PCR amplification of unmodified and 7-deazapurine 
containing 103-mer nucleic acids. Also shown are nucleotide sequences of the 17-mer 
primers used in the PCR. 

FIGURE 26 shows the result of a polyacrylamide gel electrophoresis of 
PCR products purified and concentrated for MALDI-TOF MS analysis. M: chain length 
marker, lane 1: 7-deazapurine containing 99-mer PCR product, lane 2: unmodified 99- 
mer, lane 3: 7-deazapurine containing 103-mer and lane 4: unmodified 103-mer PCR 
product. 

FIGURE 27: an autoradiogram of polyacrylamide gel electrophoresis of 
PCR reactions carried out with 5'-[32P]-labeled primers 1 and 4. Lanes 1 and 2: 
unmodified and 7-deazapurine modified 103-mer PCR product (53321 and 23520 
counts), lanes 3 and 4: unmodified and 7-deazapurine modified 200-mer (71123 and 
39582 counts) and lanes 5 and 6: unmodified and 7-deazapurine modified 99-mer 
(173216 and 94400 counts). 

FIGURE 28: a) MALDI-TOF mass spectrum of the unmodified 103-mer 
PCR products (sum of twelve single shot spectra). The mean value of the masses 
calculated for the two single strands (31768 u and 31759 u) is 31763 u. Mass resolution: 
18. b) MALDI-TOF mass spectrum of 7-deazapurine containing 103-mer PCR product 
(sum of three single shot spectra). The mean value of the masses calculated for the two 
single strands (31727 u and 31719 u) is 31723 u. Mass resolution: 67. 

FIGURE 29: a) MALDI-TOF mass spectrum of the unmodified 99-mer 
PCR product (sum of twenty single shot spectra). Values of the masses calculated for the 
two single strands: 30261 u and 30794 u. b) MALDI-TOF mass spectrum of the 7- 
deazapurine containing 99-mer PCR product (sum of twelve single shot spectra). Values 
of the masses calculated for the two single strands: 30224 u and 30750 u. 

FIGURE 30: a) MALDI-TOF mass spectrum of the unmodified 200-mer 
PCR product (sum of 30 single shot spectra). The mean value of the masses calculated 
for the two single strands (61873 u and 61595 u) is 61734 u. Mass resolution: 28. b) 
MALDI-TOF mass spectrum of 7-deazapurine containing 200-mer PCR product (sum of 
30 single shot spectra). The mean value of the masses calculated for the two single 
strands (61772 u and 61514 u) is 61643 u. Mass resolution: 39. 

FIGURE 31: a) MALDI-TOF mass spectrum of 7-deazapurine containing 
100-mer PCR product with ribomodified primers. The mean value of the masses 
calculated for the two single strands (30529 u and 31095 u) is 30812 u. b) MALDI-TOF 
mass spectrum of the PCR-product after hydrolytic primer-cleavage. The mean value of 
the masses calculated for the two single strands (25104 u and 25229 u) is 25167 u. The 
mean value of the cleaved primers (5437 u and 5918 u) is 5677 u. 

FIGURE 32 A-D shows the MALDI-TOF mass spectrum of the four 
sequencing ladders obtained from a 39-mer template (SEQ. ID. No. 13), which was 
immobilized to streptavidin beads via a 3' biotinylation. A 14-mer primer (SEQ. ID. No. 
14) was used in the sequencing. 

FIGURE 33 shows a MALDI-TOF mass spectrum of a solid state 
sequencing of a 78-mer template (SEQ. ID. No. 15), which was immobilized to 
streptavidin beads via a 3' biotinylation. An 18-mer primer (SEQ ID No. 16) and ddGTP 
were used in the sequencing. 

FIGURE 34 shows a scheme in which duplex DNA probes with single- 
stranded overhang capture specific DNA templates and also serve as primers for solid 
state sequencing. 

FIGURE 35 A-D shows MALDI-TOF mass spectra obtained from a 5' 
fluorescent labeled 23-mer (SEQ. ID. No. 19) annealed to a 3' biotinylated 18-mer 
(SEQ. ID. No. 20), leaving a 5-base overhang, which captured a 15-mer template (SEQ. 
ID. No. 21). 

FIGURE 36 shows a stacking fluorogram of the same products obtained 
from the reaction described in FIGURE 35, but run on a conventional DNA sequencer. 

Detailed Description of the Invention 

This invention describes an improved method of sequencing DNA. In 
particular, this invention employs mass spectrometry to analyze the Sanger sequencing 
reaction mixtures. 

In Sanger sequencing, four families of chain-terminated fragments are 
obtained. The mass difference per nucleotide addition is 289.19 for dpC, 313.21 for dpA, 
329.21 for dpG and 304.2 for dpT, respectively. 
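
The four mass increments quoted above can be turned into a simple lookup, sketched here in Python. The function names and the 2.0 u tolerance are illustrative assumptions and are not taken from the specification; only the four increments come from the text.

```python
# Mass added per nucleotide, as quoted in the paragraph above (in mass units).
MASS_INCREMENT = {"C": 289.19, "A": 313.21, "G": 329.21, "T": 304.2}


def base_from_mass_difference(delta, tolerance=2.0):
    """Return the base whose increment is closest to delta.

    Returns None when no increment lies within the (assumed) tolerance.
    """
    base, mass = min(MASS_INCREMENT.items(), key=lambda kv: abs(kv[1] - delta))
    return base if abs(mass - delta) <= tolerance else None


def smallest_gap():
    """Smallest difference between any two of the four increments."""
    masses = sorted(MASS_INCREMENT.values())
    return min(b - a for a, b in zip(masses, masses[1:]))


if __name__ == "__main__":
    print(base_from_mass_difference(304.2))  # closest increment is dpT
    print(round(smallest_gap(), 2))          # gap between dpT and dpA
```

The smallest gap, between dpT and dpA, is about 9 mass units, which is the separation a spectrometer has to resolve when the four unmodified families are analyzed together.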

In one embodiment, through the separate determination of the molecular 
weights of the four base-specifically terminated fragment families, the DNA sequence can 
be assigned via superposition (e.g., interpolation) of the molecular weight peaks of the 
four individual experiments. In another embodiment, the molecular weights of the four 
specifically terminated fragment families can be determined simultaneously by MS, either 
by mixing the products of all four reactions run in at least two separate reaction vessels 
(i.e., all run separately, or two together, or three together) or by running one reaction 
having all four chain-terminating nucleotides (e.g., a reaction mixture comprising dTTP, 
ddTTP, dATP, ddATP, dCTP, ddCTP, dGTP, ddGTP) in one reaction vessel. By 
simultaneously analyzing all four base-specifically terminated reaction products, the 
molecular weight values have been, in effect, interpolated. Comparison of the mass 
difference measured between fragments with the known masses of each chain-terminating 
nucleotide allows the assignment of sequence to be carried out. In some instances, it may 
be desirable to mass modify, as discussed below, the chain-terminating nucleotides so as 
to expand the difference in molecular weight between each nucleotide. It will be apparent 
to those skilled in the art when mass-modification of the chain-terminating nucleotides is 
desirable and can depend, for instance, on the resolving ability of the particular 
spectrometer employed. By way of example, it may be desirable to produce four chain- 
terminating nucleotides, ddTTP, ddCTP^1, ddATP^2 and ddGTP^3, where ddCTP^1, 
ddATP^2 and ddGTP^3 have each been mass-modified so as to have molecular weights 
resolvable from one another by the particular spectrometer being used. 

The terms chain-elongating nucleotides and chain-terminating nucleotides 
are well known in the art. For DNA, chain-elongating nucleotides include 
2'-deoxyribonucleotides and chain-terminating nucleotides include 
2',3'-dideoxyribonucleotides. For RNA, chain-elongating nucleotides include 
ribonucleotides and chain-terminating nucleotides include 3'-deoxyribonucleotides. The 
term nucleotide is also well known in the art. For the purposes of this invention, 
nucleotides include nucleoside mono-, di-, and triphosphates. Nucleotides also include 
modified nucleotides such as phosphorothioate nucleotides. 

Since mass spectrometry is a serial method, in contrast to currently used 
slab gel electrophoresis which allows several samples to be processed in parallel, in 
another embodiment of this invention, a further improvement can be achieved by 
multiplex mass spectrometric DNA sequencing to allow simultaneous sequencing of more 
than one DNA or RNA fragment. As described in more detail below, the range of about 
300 mass units between successive nucleotide additions can be utilized by employing either 
mass-modified nucleic acid sequencing primers or chain-elongating and/or terminating 
nucleoside triphosphates so as to shift the molecular weight of the base-specifically 
terminated fragments of a particular DNA or RNA species being sequenced in a 
predetermined manner. For the first time, several sequencing reactions can be mass 
spectrometrically analyzed in parallel. In yet another embodiment of this invention, 
multiplex mass spectrometry DNA sequencing can be performed by mass modifying the 
fragment families through specific oligonucleotides (tag probes) which hybridize to 
specific tag sequences within each of the fragment families. In another embodiment, the 
tag probe can be covalently attached to the individual and specific tag sequence prior to 
mass spectrometry. 

Preferred mass spectrometer formats for use in the invention are matrix 
assisted laser desorption ionization (MALDI), electrospray (ES), ion cyclotron resonance 
(ICR) and Fourier Transform. For ES, the samples, dissolved in water or in a volatile 
buffer, are injected either continuously or discontinuously into an atmospheric pressure 
ionization interface (API) and then mass analyzed by a quadrupole. The generation of 
multiple ion peaks which can be obtained using ES mass spectrometry can increase the 
accuracy of the mass determination. Even more detailed information on the specific 
structure can be obtained using an MS/MS quadrupole configuration. 

In MALDI mass spectrometry, various mass analyzers can be used, e.g., 
magnetic sector/magnetic deflection instruments in single or triple quadrupole mode 
(MS/MS), Fourier transform and time-of-flight (TOF) configurations as is known in the 
art of mass spectrometry. For the desorption/ionization process, numerous matrix/laser 
combinations can be used. Ion-trap and reflectron configurations can also be employed. 
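
One way to see why multiple ion peaks can improve mass accuracy is that each charge state gives an independent estimate of the same neutral mass, and the estimates can be averaged. The Python sketch below illustrates this under stated assumptions that are not taken from the specification: negative-ion mode with ions of the form [M - zH]^z-, charge states already assigned, and the standard proton mass of 1.00728 u.

```python
from statistics import mean

PROTON_MASS = 1.00728  # u; standard value, assumed here (not from the text)


def neutral_mass(mz, z):
    """Neutral mass implied by one peak of an [M - zH]^z- ion (assumed ion type)."""
    return z * (mz + PROTON_MASS)


def averaged_mass(peaks):
    """Average the neutral-mass estimates from several (m/z, charge) peaks."""
    return mean(neutral_mass(mz, z) for mz, z in peaks)


if __name__ == "__main__":
    # Hypothetical peaks for an oligonucleotide of mass 6076.02 u at charges 3 to 5.
    true_mass = 6076.02
    peaks = [(true_mass / z - PROTON_MASS, z) for z in (3, 4, 5)]
    print(round(averaged_mass(peaks), 2))
```

With real data each peak carries its own measurement error, so averaging over charge states tends to reduce the uncertainty of the final mass value.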

In one embodiment of the invention, the molecular weight values of at 
least two base-specifically terminated fragments are determined concurrently using mass 
spectrometry. The molecular weight values of preferably at least five and more 
preferably at least ten base-specifically terminated fragments are determined by mass 
spectrometry. Also included in the invention are determinations of the molecular weight 
values of at least 20 base-specifically terminated fragments and at least 30 base- 
specifically terminated fragments. Further, the nested base-specifically terminated 
fragments in a specific set can be purified of all reactants and by-products but are not 
separated from one another. The entire set of nested base-specifically terminated 
fragments is analyzed concurrently and the molecular weight values are determined. At 
least two base-specifically terminated fragments are analyzed concurrently by mass 
spectrometry when the fragments are contained in the same sample. 

In general, the overall mass spectrometric DNA sequencing process will 
start with a library of small genomic fragments obtained after first randomly or 
specifically cutting the genomic DNA into large pieces which then, in several subcloning 
steps, are reduced in size and inserted into vectors like derivatives of M13 or pUC (e.g., 
M13mp18 or M13mp19) (see FIGURE 1). In a different approach, the fragments 
inserted in vectors, such as M13, are obtained via subcloning starting with a cDNA 
library. In yet another approach, the DNA fragments to be sequenced are generated by 
the polymerase chain reaction (e.g., Higuchi et al., "A General Method of in vitro 
Preparation and Mutagenesis of DNA Fragments: Study of Protein and DNA 
Interactions," Nucleic Acids Res., 16, 7351-67 (1988)). As is known in the art, Sanger 
sequencing can start from one nucleic acid primer (UP) binding to the plus-strand or from 
another nucleic acid primer binding to the opposite minus-strand. Thus, either the 
complementary sequence of both strands of a given unknown DNA sequence can be 
obtained (providing for reduction of ambiguity in the sequence determination) or the 
length of the sequence information obtainable from one clone can be extended by 
generating sequence information from both ends of the unknown vector-inserted DNA 
fragment. 

The nucleic acid primer carries, preferentially at the 5'-end, a linking 
functionality, L, which can include a spacer of sufficient length and which can interact 
with a suitable functionality, L', on a solid support to form a reversible linkage such as a 
photocleavable bond. Since each of the four Sanger sequencing families starts with a 
nucleic acid primer (L-UP; FIGURE 1) this fragment family can be bound to the solid 
support by reacting with functional groups, L', on the surface of a solid support and then 
intensively washed to remove all buffer salts, triphosphates, enzymes, reaction by- 
products, etc. Furthermore, for mass spectrometric analysis, it can be of importance at 
this stage to exchange the cation at the phosphate backbone of the DNA fragments in 
order to eliminate peak broadening due to a heterogeneity in the cations bound per 
nucleotide unit. Since the L-L' linkage is only of a temporary nature with the purpose to 
capture the nested Sanger DNA or RNA fragments to properly condition them for mass 
spectrometry analysis, there are different chemistries which can serve this purpose. In 
addition to the examples given in which the nested fragments are coupled covalently to 
the solid support, washed, and cleaved off the support for mass spectrometry analysis, 
the temporary linkage can be such that it is cleaved under the conditions of mass 
spectrometry, i.e., a photocleavable bond such as a charge transfer complex or a stable 
organic radical. Furthermore, the linkage can be formed with L' being a quaternary 
ammonium group (some examples are given in FIGURE 19). In this case, preferably, the 
surface of the solid support carries negative charges which repel the negatively charged 
nucleic acid backbone and thus facilitate desorption. Desorption will take place either 
by the heat created by the laser pulse and/or, depending on L', by specific absorption of 
laser energy which is in resonance with the L' chromophore (see, e.g., examples given in 
FIGURE 19). The functionalities, L and L', can also form a charge transfer complex and 
thereby form the temporary L-L' linkage. Various examples for appropriate 
functionalities with either acceptor or donor properties are depicted without limitation 
in FIGURE 20. Since in many cases the "charge-transfer band" can be determined by 
UV/vis spectrometry (see e.g. Organic Charge Transfer Complexes by R. Foster, 
Academic Press, 1969), the laser energy can be tuned to the corresponding energy of the 
charge-transfer wavelength and, thus, a specific desorption off the solid support can be 
initiated. Those skilled in the art will recognize that several combinations can serve this 
purpose and that the donor functionality can be either on the solid support or coupled to 
the nested Sanger DNA/RNA fragments or vice versa. 

In yet another approach, the temporary linkage L-L' can be generated by 
homolytically forming relatively stable radicals as exemplified in FIGURE 21. In example 
4 of FIGURE 21, a combination of the approaches using charge-transfer complexes and 
stable organic radicals is shown. Here, the nested Sanger DNA/RNA fragments are 
captured via the formation of a charge transfer complex. Under the influence of the laser 
pulse, desorption (as discussed above) as well as ionization will take place at the radical 
position. In the other examples of FIGURE 21, under the influence of the laser pulse, the 
L-L' linkage will be cleaved and the nested Sanger DNA/RNA fragments desorbed and 
subsequently ionized at the radical position formed. Those skilled in the art will 
recognize that other organic radicals can be selected and that, in relation to the 
dissociation energies needed to homolytically cleave the bond between them, a 
corresponding laser wavelength can be selected (see e.g. Reactive Molecules by C. 
Wentrup, John Wiley & Sons, 1984). In yet another approach, the nested Sanger 
DNA/RNA fragments are captured via Watson-Crick base pairing to a solid support- 
bound oligonucleotide complementary to either the sequence of the nucleic acid primer or 
the tag oligonucleotide sequence (see FIGURE 22). The duplex formed will be cleaved 
under the influence of the laser pulse and desorption can be initiated. The solid support- 
bound base sequence can be presented through natural oligoribo- or 
oligodeoxyribonucleotide as well as analogs (e.g. thio-modified phosphodiester or 
phosphotriester backbone) or employing oligonucleotide mimetics such as PNA analogs 
(see e.g. Nielsen et al., Science, 254, 1497 (1991)) which render the base sequence less 
susceptible to enzymatic degradation and hence increase overall stability of the solid 
support-bound capture base sequence. With appropriate bonds, L-L', a cleavage can be 
obtained directly with a laser tuned to the energy necessary for bond cleavage. Thus, the 
immobilized nested Sanger fragments can be directly ablated during mass spectrometric 
analysis. 

Prior to mass spectrometric analysis, it may be useful to "condition" 
nucleic acid molecules, for example to decrease the laser energy required for volatilization, 
to minimize fragmentation or to otherwise increase the sensitivity of mass spectrometric 
detection. For example, nucleic acids can be "conditioned" by adding positive or 
negative charges, i.e. charge tags (CTs). CTs increase the mass spectrometer detection 
sensitivity by increasing the degree of ionization during the mass spectrometric 
(e.g. MALDI) process. A CT can be linked either to the external 3' or 5' position or 
internally e.g. at the 2' position or at the base, e.g. at C-5 in uracil, C-5 methylgroup of 
thymine, C-5 at cytosine, at C 7 or C* of guanine, adenine and hypoxanthine or at the 
phosphate ester moiety. Charge tags, CTs, can be molecules with permanent (i.e. 
pH-independent) ionization, such as: 

[chemical structures not reproduced] 

or molecules which generate a positive charge upon MALDI and which are stabilized by 
delocalization of the positive charge by mesomeric effects in unsaturated and/or aromatic 
systems such as: 

[chemical structure not reproduced] 



wherein R, R', R'' = H, OAl (wherein Al = e.g. lower alkyl, methyl, 
ethyl, propyl), NO2, CN, CO2H, CO2 active ester, or 
halogen; and 

X = -O-, -NH-, -S-, C=O, OCO either in the para or meta 
position. 

For example, the positive charge of a trityl cation is produced during MALDI by the 
removal of a moiety such as: -OR, where R = a lower alkyl, or an anion such as ClO4-, 
SbF6-, BF4- and the like. 



In an alternative scheme, the trityl group is used to anchor the 
oligonucleotide to a solid support via the tertiary carbon and this bond is cleaved during 
mass spectrometry (e.g. MALDI), leaving a positive charge on the desorbing and high 
vacuum flying oligonucleotide. 

One of skill in the art can readily appreciate several variations to the schemes described 
above. In addition to employing the charge tag array alone, one of skill in the art can 
employ a charge tag array in conjunction with another conditioning means. Particularly 
preferred means to be used in conjunction with the CT include treating the 
phosphodiester bond with trialkylsilyl halides or the phosphomonothiodiester bond with 

alkyliodides to render the polyanionic backbone neutral. 

Another example of conditioning is modification of the phosphodiester 
backbone of the nucleic acid molecule (e.g. cation exchange), which can be useful for 
eliminating peak broadening due to a heterogeneity in the cations bound per nucleotide 
unit. In addition, when a nucleic acid molecule is contacted with an alkylating agent such 
as alkyliodide, iodoacetamide, β-iodoethanol, or 2,3-epoxy-1-propanol, the monothio 
phosphodiester bonds of the nucleic acid molecule can be transformed into 
phosphotriester bonds. Likewise, phosphodiester bonds may be transformed to uncharged 
derivatives employing trialkylsilyl chlorides. Further conditioning involves incorporating 
nucleotides which reduce sensitivity for depurination (fragmentation during MS) such as 
N7- or N9-deazapurine nucleotides, or RNA building blocks or using oligonucleotide 
triesters or incorporating phosphorothioate functions which are alkylated or employing 
oligonucleotide mimetics such as PNA. 

Modification of the phosphodiester backbone can be accomplished by, for 
example, using alpha-thio modified nucleotides for chain elongation and termination. 
With alkylating agents such as alkyliodides, iodoacetamide, β-iodoethanol, 2,3-epoxy-1- 
propanol (see FIGURE 10), the monothio phosphodiester bonds of the nested Sanger 
fragments are transformed into phosphotriester bonds. Multiplexing by mass 
modification in this case is obtained by mass-modifying the nucleic acid primer (UP) or 
the nucleoside triphosphates at the sugar or the base moiety. To those skilled in the art, 
other modifications of the nested Sanger fragments can be envisioned. In one 
embodiment of the invention, the linking chemistry allows one to cleave off the so- 
purified nested DNA enzymatically, chemically or physically. By way of example, the L- 
L' chemistry can be of a type of disulfide bond (chemically cleavable, for example, by 
mercaptoethanol or dithioerythritol), a biotin/streptavidin system, a heterobifunctional 
derivative of a trityl ether group (Koster et al., "A Versatile Acid-Labile Linker for 
Modification of Synthetic Biomolecules," Tetrahedron Letters, 31, 7095 (1990)) which 
can be cleaved under mildly acidic conditions, a levulinyl group cleavable under almost 
neutral conditions with a hydrazinium/acetate buffer, an arginine-arginine or lysine-lysine 
bond cleavable by an endopeptidase enzyme like trypsin or a pyrophosphate bond 
cleavable by a pyrophosphatase, a photocleavable bond which can be, for example, 
physically cleaved and the like (see, e.g., FIGURE 23). Optionally, another cation 
exchange can be performed prior to mass spectrometry analysis. In the instance that an 
enzyme-cleavable bond is utilized to immobilize the nested fragments, the enzyme used to 
cleave the bond can serve as an internal mass standard during MS analysis. 

The purification process and/or ion exchange process can be carried out by 
a number of other methods instead of, or in conjunction with, immobilization on a solid 
support. For example, the base-specifically terminated products can be separated from 
the reactants by dialysis, filtration (including ultrafiltration), and chromatography. 
Likewise, these techniques can be used to exchange the cation of the phosphate backbone 
with a counter-ion which reduces peak broadening. 

The base-specifically terminated fragment families can be generated by 
standard Sanger sequencing using the Large Klenow fragment of E. coli DNA 
polymerase I, by Sequenase, Taq DNA polymerase and other DNA polymerases suitable 
for this purpose, thus generating nested DNA fragments for the mass spectrometric 
analysis. It is, however, part of this invention that base-specifically terminated RNA 
transcripts of the DNA fragments to be sequenced can also be utilized for mass 
spectrometric sequence determination. In this case, various RNA polymerases such as 
the SP6 or the T7 RNA polymerase can be used on appropriate vectors containing, for 
example, the SP6 or the T7 promoters (e.g. Axelrod et al., "Transcription from 
Bacteriophage T7 and SP6 RNA Polymerase Promoters in the Presence of 3'- 
Deoxyribonucleoside 5'-triphosphate Chain Terminators," Biochemistry 24, 5716-23 
(1985)). In this case, the unknown DNA sequence fragments are inserted downstream 
from such promoters. Transcription can also be initiated by a nucleic acid primer (Pitulle 
et al., "Initiator Oligonucleotides for the Combination of Chemical and Enzymatic RNA 
Synthesis," Gene 112, 101-105 (1992)) which carries, as one embodiment of this 
invention, appropriate linking functionalities, L, which allow the immobilization of the 
nested RNA fragments, as outlined above, prior to mass spectrometric analysis for 
purification and/or appropriate modification and/or conditioning. 

For this immobilization process of the DNA/RNA sequencing products for 
mass spectrometric analysis, various solid supports can be used, e.g., beads (silica gel, 
controlled pore glass, magnetic beads, Sephadex/Sepharose beads, cellulose beads, etc.), 
capillaries, glass fiber filters, glass surfaces, metal surfaces or plastic material. Examples 
of useful plastic materials include membranes in filter or microtiter plate formats, the 
latter allowing the automation of the purification process by employing microtiter plates 
which, as one embodiment of the invention, carry a permeable membrane in the bottom of 
the well functionalized with L'. Membranes can be based on polyethylene, polypropylene, 
polyamide, polyvinylidenedifluoride and the like. Examples of suitable metal surfaces 
include steel, gold, silver, aluminum, and copper. After purification, cation exchange, 
and/or modification of the phosphodiester backbone of the L-L' bound nested Sanger 
fragments, they can be cleaved off the solid support chemically, enzymatically or 
physically. Also, the L-L' bound fragments can be cleaved from the support when they 
are subjected to mass spectrometry analysis by using appropriately chosen L-L' linkages 
and corresponding laser energies/intensities as described above and in FIGURES 19-23. 

The highly purified, four base-specifically terminated DNA or RNA 
fragment families are then analyzed with regard to their fragment lengths via 
determination of their respective molecular weights by MALDI or ES mass spectrometry. 

For ES, the samples, dissolved in water or in a volatile buffer, are injected 
either continuously or discontinuously into an atmospheric pressure ionization interface 
(API) and then mass analyzed by a quadrupole. With the aid of a computer program, the 
molecular weight peaks are searched for the known molecular weight of the nucleic acid 
primer (UP), and it is determined which of the four chain-terminating nucleotides has been 
added to the UP. This represents the first nucleotide of the unknown sequence. Then, 
the second, the third, the nth extension product can be identified in a similar manner and, 
by this, the nucleotide sequence is assigned. The generation of multiple ion peaks which 
can be obtained using ES mass spectrometry can increase the accuracy of the mass 
determination. 
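
The peak-walking procedure described above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are not part of the specification: all four terminated families are treated as one combined peak list, each step adds exactly one of the four increments quoted earlier (the small mass difference of the dideoxy terminator and any cation adducts are ignored), and the primer mass, tolerance and function names are hypothetical.

```python
import random

# Mass added per nucleotide, as quoted earlier in the description.
MASS_INCREMENT = {"C": 289.19, "A": 313.21, "G": 329.21, "T": 304.2}


def ladder(primer_mass, sequence):
    """Idealized peak masses of the nested fragments for a known sequence."""
    peaks, mass = [], primer_mass
    for base in sequence:
        mass += MASS_INCREMENT[base]
        peaks.append(mass)
    return peaks


def call_sequence(primer_mass, peaks, tolerance=2.0):
    """Walk upward from the primer mass, assigning one base per peak."""
    current, called = primer_mass, []
    for peak in sorted(p for p in peaks if p > primer_mass):
        delta = peak - current
        base = min(MASS_INCREMENT, key=lambda b: abs(MASS_INCREMENT[b] - delta))
        if abs(MASS_INCREMENT[base] - delta) > tolerance:
            break  # missing or unassignable peak: stop rather than guess
        called.append(base)
        current = peak
    return "".join(called)


if __name__ == "__main__":
    peaks = ladder(5000.0, "GATC")  # hypothetical primer mass of 5000 u
    random.shuffle(peaks)           # peak order in the input does not matter
    print(call_sequence(5000.0, peaks))  # prints "GATC"
```

Stopping at the first unassignable gap is a deliberately conservative choice; a practical program would more likely try to bridge a missing peak by testing sums of two increments.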

In MALDI mass spectrometry, various mass analyzers can be used, e.g., 
magnetic sector/magnetic deflection instruments in single or triple quadrupole mode 
(MS/MS), Fourier transform and time-of-flight (TOF) configurations as is known in the 
art of mass spectrometry. FIGURES 2A through 6 are given as an example of the data 
obtainable when sequencing a hypothetical DNA fragment of 50 nucleotides in length 
(SEQ ID NO:3) and having a molecular weight of 15,344.02 daltons. The molecular 
weights calculated for the ddT (FIGURES 2A and 2B), ddA (FIGURES 3A and 3B), 
ddG (FIGURES 4A and 4B) and ddC (FIGURES 5A and 5B) terminated products are 
given (corresponding to fragments of SEQ ID NO:3) and the idealized four MALDI-TOF 
mass spectra shown. All four spectra are superimposed, and from this, the DNA 
sequence can be generated. This is shown in the summarizing FIGURE 6, demonstrating 
how the molecular weights are correlated with the DNA sequence. MALDI-TOF spectra 
have been generated for the ddT terminated products (FIGURE 16) corresponding to 
those shown in FIGURE 2 and these spectra have been superimposed (FIGURE 17). 
The correlation of calculated molecular weights of the ddT fragments and their 
experimentally-verified weights are shown in Table 1. Likewise, if all four chain- 
terminating reactions are combined and then analyzed by mass spectrometry, the 
molecular weight difference between two adjacent peaks can be used to determine the 
sequence. For the desorption/ionization process, numerous matrix/laser combinations 
can be used. 

Table 1. Correlation of calculated and experimentally verified molecular weights of the 
13 DNA fragments of FIGURES 2 and 16. 

Fragment (n-mer)    calculated mass    experimental mass    difference 
 7-mer                  2104.45             2119.9             +15.4 
10-mer                  3011.04             3026.1             +15.1 
11-mer                  3315.24             3330.1             +14.9 
19-mer                  5771.82             5788.0             +16.2 
20-mer                  6076.02             6093.8             +17.8 
24-mer                  7311.82             7374.9             +63.1 
26-mer                  7945.22             7960.9             +15.7 
33-mer                 10112.63            10125.3             +12.7 
37-mer                 11348.43            11361.4             +13.0 
38-mer                 11652.62            11670.2             +17.6 
42-mer                 12872.42            12888.3             +15.9 
46-mer                 14108.22            14125.0             +16.8 
50-mer                 15344.02            15362.6             +18.6 
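
The difference column of the table above can be recomputed directly from the two mass columns, as in the following Python sketch. The mass values are copied from the table; the median-based outlier check and its 10-unit threshold are illustrative assumptions and are not part of the specification.

```python
from statistics import median

# (fragment length, calculated mass, experimental mass), copied from the table above.
TABLE_1 = [
    (7, 2104.45, 2119.9), (10, 3011.04, 3026.1), (11, 3315.24, 3330.1),
    (19, 5771.82, 5788.0), (20, 6076.02, 6093.8), (24, 7311.82, 7374.9),
    (26, 7945.22, 7960.9), (33, 10112.63, 10125.3), (37, 11348.43, 11361.4),
    (38, 11652.62, 11670.2), (42, 12872.42, 12888.3), (46, 14108.22, 14125.0),
    (50, 15344.02, 15362.6),
]


def differences(rows):
    """Experimental minus calculated mass for each fragment length."""
    return {n: experimental - calculated for n, calculated, experimental in rows}


def outliers(rows, threshold=10.0):
    """Fragments whose offset deviates from the median offset by more than threshold."""
    diffs = differences(rows)
    typical = median(diffs.values())
    return [n for n, d in diffs.items() if abs(d - typical) > threshold]


if __name__ == "__main__":
    for n, d in differences(TABLE_1).items():
        print(f"{n:>2}-mer  {d:+.2f}")
    print("outliers:", outliers(TABLE_1))
```

With these values, twelve of the thirteen fragments show an offset between roughly +12.7 and +18.6 mass units, while the 24-mer stands apart at about +63.1.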

In order to increase throughput to a level necessary for high volume 
genomic and cDNA sequencing projects, a further embodiment of the present invention is 
to utilize multiplex mass spectrometry to simultaneously determine more than one 
sequence. This can be achieved by several, albeit different, methodologies, the basic 
principle being the mass modification of the nucleic acid primer (UP), the chain- 
elongating and/or terminating nucleoside triphosphates, or by using mass-differentiated 
tag probes hybridizable to specific tag sequences. The term "nucleic acid primer" as used 
herein encompasses primers for both DNA and RNA Sanger sequencing. 

By way of example, FIGURE 7 presents a general formula of the nucleic 
acid primer (UP) and the tag probes (TP). The mass modifying moiety can be attached, 
for instance, to either the 5'-end of the oligonucleotide (M1), to the nucleobase (or bases) 
(M2, M7), to the phosphate backbone (M3), and to the 2'-position of the nucleoside 
(nucleosides) (M4, M6) or/and to the terminal 3'-position (M5). Primer length can vary 
between 1 and 50 nucleotides in length. For the priming of DNA Sanger sequencing, the 
primer is preferentially in the range of about 15 to 30 nucleotides in length. For 
artificially priming the transcription in a RNA polymerase-mediated Sanger sequencing 
reaction, the length of the primer is preferentially in the range of about 2 to 6 nucleotides. 
If a tag probe (TP) is to hybridize to the integrated tag sequence of a family of chain- 
terminated fragments, its preferential length is about 20 nucleotides. 

The table in FIGURE 7 depicts some examples of mass-modified primer/tag 
probe configurations for DNA, as well as RNA, Sanger sequencing. This list is, however, 
not meant to be limiting, since numerous other combinations of mass-modifying functions 
and positions within the oligonucleotide molecule are possible and are deemed part of the 
invention. The mass-modifying functionality can be, for example, a halogen, an azido, or 
of the type, XR, wherein X is a linking group and R is a mass-modifying functionality. 
The mass-modifying functionality can thus be used to introduce defined mass increments 
into the oligonucleotide molecule. 

In another embodiment, the nucleotides used for chain-elongation and/or 
termination are mass-modified. Examples of such modified nucleotides are shown in 
FIGURE 8. Here the mass-modifying moiety, M, can be attached either to the 
nucleobase, M2 (in case of the c7-deazanucleosides also to C-7, M7), to the triphosphate 
group at the alpha phosphate, M3, or to the 2'-position of the sugar ring of the nucleoside 
triphosphate, M4 and M6. Furthermore, the mass-modifying functionality can be added 
so as to affect chain termination, such as by attaching it to the 3'-position of the sugar 
ring in the nucleoside triphosphate, M5. The list in FIGURE 8 represents examples of 
possible configurations for generating chain-terminating nucleoside triphosphates for 
RNA or DNA Sanger sequencing. For those skilled in the art, however, it is clear that 
many other combinations can serve the purpose of the invention equally well. In the 
same way, those skilled in the art will recognize that chain-elongating nucleoside 
triphosphates can also be mass-modified in a similar fashion with numerous variations and 
combinations in functionality and attachment positions. 

Without limiting the scope of the invention, FIGURE 9 gives a more
detailed description of particular examples of how the mass-modification, M, can be
introduced for X in XR as well as using oligo/polyethylene glycol derivatives for R. The
mass-modifying increment in this case is 44, i.e., five different mass-modified species can
be generated by just changing m from 0 to 4, thus adding mass units of 45 (m=0), 89
(m=1), 133 (m=2), 177 (m=3) and 221 (m=4) to the nucleic acid primer (UP), the tag
probe (TP) or the nucleoside triphosphates respectively. The oligo/polyethylene glycols
can also be monoalkylated by a lower alkyl such as methyl, ethyl, propyl, isopropyl, t-
butyl and the like. A selection of linking functionalities, X, is also illustrated. Other
chemistries can be used in the mass-modified compounds, as for example, those described
recently in Oligonucleotides and Analogues: A Practical Approach, F. Eckstein, editor,
IRL Press, Oxford, 1991.
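The arithmetic behind the oligo/polyethylene glycol series can be checked with a short sketch. It assumes only what the text states, namely a nominal mass of 45 for m = 0 and 44 further mass units per additional glycol unit; the function name peg_mass_increment is illustrative and not part of the disclosure.

```python
# Nominal values taken from the text: 45 for m = 0, plus 44 per glycol unit.
BASE_INCREMENT = 45
GLYCOL_UNIT = 44


def peg_mass_increment(m: int) -> int:
    """Nominal mass added by an oligo/polyethylene glycol modifier with index m."""
    if m < 0:
        raise ValueError("m must be non-negative")
    return BASE_INCREMENT + GLYCOL_UNIT * m


def peg_series(count: int) -> list:
    """Nominal mass increments for m = 0 .. count - 1."""
    return [peg_mass_increment(m) for m in range(count)]
```

Calling peg_series(5) reproduces the five values listed above, so five primers, tag probes or nucleoside triphosphates modified in this way would be spaced 44 mass units apart.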

In yet another embodiment, various mass-modifying functionalities, R, other
than oligo/polyethylene glycols, can be selected and attached via appropriate linking
chemistries, X. Without any limitation, some examples are given in FIGURE 10. A
simple mass-modification can be achieved by replacing H with halogens such as F, Cl, Br
and/or I, or pseudohalogens such as SCN, NCS, or by using different alkyl, aryl or aralkyl
moieties such as methyl, ethyl, propyl, isopropyl, t-butyl, hexyl, phenyl, substituted
phenyl, benzyl, or functional groups such as CH₂F, CHF₂, CF₃, Si(CH₃)₃,
Si(CH₃)₂(C₂H₅), Si(CH₃)(C₂H₅)₂, Si(C₂H₅)₃. Yet another mass-modification can be
obtained by attaching homo- or heteropeptides through X to the UP, TP or nucleoside
triphosphates. One example useful in generating mass-modified species with a mass
increment of 57 is the attachment of oligoglycines, e.g., mass-modifications of 74 (r=1,
m=0), 131 (r=1, m=2), 188 (r=1, m=3), 245 (r=1, m=4) are achieved. Simple
oligoamides also can be used, e.g., mass-modifications of 74 (r=1, m=0), 88 (r=2, m=0),
102 (r=3, m=0), 116 (r=4, m=0), etc. are obtainable. For those skilled in the art, it will
be obvious that there are numerous possibilities in addition to those given in FIGURE 10
and the above mentioned reference (Oligonucleotides and Analogues, F. Eckstein, 1991),
for introducing, in a predetermined manner, many different mass-modifying functionalities
to UP, TP and nucleoside triphosphates which are acceptable for DNA and RNA Sanger
sequencing.
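The two peptide-type series quoted above are simple arithmetic progressions, which the following sketch makes explicit. It assumes a starting value of 74, a step of 57 per added glycine and a step of 14 per added methylene group in the oligoamide series. The index k only counts successive members of each series and is an assumption of this sketch, as are the function names.

```python
FIRST_MEMBER = 74      # first value quoted for both series
GLYCINE_STEP = 57      # increment per additional glycine residue
METHYLENE_STEP = 14    # increment per additional CH2 group (r -> r + 1)


def oligoglycine_series(count: int) -> list:
    """First `count` nominal mass modifications of the oligoglycine series."""
    return [FIRST_MEMBER + GLYCINE_STEP * k for k in range(count)]


def oligoamide_series(count: int) -> list:
    """First `count` nominal mass modifications of the simple oligoamide series."""
    return [FIRST_MEMBER + METHYLENE_STEP * k for k in range(count)]
```

The much finer 14-unit spacing of the oligoamide series illustrates why the choice of R matters: the smaller the step, the higher the mass resolution needed to separate neighbouring species.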

As used herein, the superscript 0-i designates i + 1 mass differentiated
nucleotides, primers or tags. In some instances, the superscript 0 (e.g., NTP⁰, UP⁰) can
designate an unmodified species of a particular reactant, and the superscript i (e.g., NTPⁱ,
NTP¹, NTP², etc.) can designate the i-th mass-modified species of that reactant. If, for
example, more than one species of nucleic acids (e.g., DNA clones) are to be
concurrently sequenced by multiplex DNA sequencing, then i + 1 different mass-modified
nucleic acid primers (UP⁰, UP¹, ... UPⁱ) can be used to distinguish each set of base-
specifically terminated fragments, wherein each species of mass-modified UPⁱ can be
distinguished by mass spectrometry from the rest.

As illustrative embodiments of this invention, three different basic processes
for multiplex mass spectrometric DNA sequencing employing the described mass-
modified reagents are described below:

A) Multiplexing by the use of mass-modified nucleic acid primers
(UP) for Sanger DNA or RNA sequencing (see for example FIGURE 11);

B) Multiplexing by the use of mass-modified nucleoside
triphosphates as chain elongators and/or chain terminators for Sanger
DNA or RNA sequencing (see for example FIGURE 12); and

C) Multiplexing by the use of tag probes which specifically
hybridize to tag sequences which are integrated into part of the four
Sanger DNA/RNA base-specifically terminated fragment families. Mass
modification here can be achieved as described for FIGURES 7, 9 and 10,
or alternately, by designing different oligonucleotide sequences having the
same or different length with unmodified nucleotides which, in a
predetermined way, generate appropriately differentiated molecular
weights (see for example FIGURE 13).

The process of multiplexing by mass-modified nucleic acid primers (UP) is
illustrated by way of example in FIGURE 11 for mass analyzing four different DNA
clones simultaneously. The first reaction mixture is obtained by standard Sanger DNA
sequencing having unknown DNA fragment 1 (clone 1) integrated in an appropriate
vector (e.g., M13mp18), employing an unmodified nucleic acid primer UP⁰, and a
standard mixture of the four unmodified deoxynucleoside triphosphates, dNTP⁰, and
with 1/10th of one of the four dideoxynucleoside triphosphates, ddNTP⁰. A second
reaction mixture for DNA fragment 2 (clone 2) is obtained by employing a mass-modified
nucleic acid primer UP¹ and, as before, the four unmodified nucleoside triphosphates,
dNTP⁰, containing in each separate Sanger reaction 1/10th of the chain-terminating
unmodified dideoxynucleoside triphosphates ddNTP⁰. In the other two experiments, the
four Sanger reactions have the following compositions: DNA fragment 3 (clone 3), UP²,
dNTP⁰, ddNTP⁰ and DNA fragment 4 (clone 4), UP³, dNTP⁰, ddNTP⁰. For mass
spectrometric DNA sequencing, all base-specifically terminated reactions of the four
clones are pooled and mass analyzed. The various mass peaks belonging to the four
dideoxy-terminated (e.g., ddT-terminated) fragment families are assigned to specifically
elongated and ddT-terminated fragments by searching (such as by a computer program)
for the known molecular ion peaks of UP⁰, UP¹, UP² and UP³ extended by one of
the four dideoxynucleoside triphosphates, UP⁰-ddN⁰, UP¹-ddN⁰, UP²-ddN⁰ and UP³-
ddN⁰. In this way, the first nucleotides of the four unknown DNA sequences of clones 1
to 4 are determined. The process is repeated, having memorized the molecular masses of
the four specific first extension products, until the four sequences are assigned.
Unambiguous mass/sequence assignments are possible even in the worst case scenario in
which the four mass-modified nucleic acid primers are extended by the same
dideoxynucleoside triphosphate, the extension products then being, for example, UP⁰-
ddT, UP¹-ddT, UP²-ddT and UP³-ddT, which differ by the known mass increment
differentiating the four nucleic acid primers. In another embodiment of this invention, an
analogous technique is employed using different vectors containing, for example, the SP6
and/or T7 promoter sequences, and performing transcription with the nucleic acid
primers UP⁰, UP¹, UP² and UP³ and an RNA polymerase (e.g., SP6 or T7 RNA
polymerase) with chain-elongating and terminating unmodified nucleoside triphosphates
NTP⁰ and 3'-dNTP⁰. Here, the DNA sequence is being determined by Sanger RNA
sequencing.
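The peak search for the first extension products can be sketched as follows. This is a minimal illustration, not the program referred to in the text: the residue masses are rounded illustrative values, the primer masses and the 44-unit step are hypothetical, and the names primer_masses and assign_first_extension are introduced here only for the example.

```python
# Illustrative, rounded masses in Da; none of these values come from the disclosure.
DDN_RESIDUE = {"A": 297.2, "C": 273.2, "G": 313.2, "T": 288.2}


def primer_masses(unmodified_mass, n_clones, step):
    """Masses of UP0, UP1, ... assuming each species adds one more fixed step."""
    return [unmodified_mass + i * step for i in range(n_clones)]


def assign_first_extension(peaks_by_pool, primers, tol=0.5):
    """Map clone index -> first base by matching peaks to primer + ddN masses.

    peaks_by_pool maps a terminator base to the peaks observed in the pooled
    reaction terminated with that base.
    """
    calls = {}
    for base, peaks in peaks_by_pool.items():
        for clone, primer_mass in enumerate(primers):
            expected = primer_mass + DDN_RESIDUE[base]
            if any(abs(peak - expected) <= tol for peak in peaks):
                if clone in calls:
                    raise ValueError(f"ambiguous call for clone {clone}")
                calls[clone] = base
    return calls
```

In the worst case described above, all four primers are extended by ddT, so the ddT pool contains four peaks spaced by the primer increment and each is still assigned to a single clone. Later positions would be read the same way after adding the mass of each assigned nucleotide to the memorized fragment mass.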

FIGURE 12 illustrates the process of multiplexing by mass-modified chain-
elongating and/or terminating nucleoside triphosphates in which three different DNA
fragments (3 clones) are mass analyzed simultaneously. The first DNA Sanger
sequencing reaction (DNA fragment 1, clone 1) is the standard mixture employing
unmodified nucleic acid primer UP⁰, dNTP⁰ and in each of the four reactions one of the
four ddNTP⁰. The second (DNA fragment 2, clone 2) and the third (DNA fragment 3,
clone 3) have the following contents: UP⁰, dNTP⁰, ddNTP¹ and UP⁰, dNTP⁰, ddNTP²,
respectively. In a variation of this process, an amplification of the mass increment in
mass-modifying the extended DNA fragments can be achieved by either using an equally
mass-modified deoxynucleoside triphosphate (i.e., dNTP¹, dNTP²) for chain elongation
alone or in conjunction with the homologous equally mass-modified dideoxynucleoside
triphosphate. For the three clones depicted above, the contents of the reaction mixtures
can be as follows: either UP⁰/dNTP⁰/ddNTP⁰, UP⁰/dNTP¹/ddNTP⁰ and
UP⁰/dNTP²/ddNTP⁰ or UP⁰/dNTP⁰/ddNTP⁰, UP⁰/dNTP¹/ddNTP¹ and
UP⁰/dNTP²/ddNTP². As described above, DNA sequencing can be performed by
Sanger RNA sequencing employing unmodified nucleic acid primers, UP⁰, and an
appropriate mixture of chain-elongating and terminating nucleoside triphosphates. The
mass-modification can be again either in the chain-terminating nucleoside triphosphate
alone or in conjunction with mass-modified chain-elongating nucleoside triphosphates.
Multiplexing is achieved by pooling the three base-specifically terminated sequencing
reactions (e.g., the ddTTP terminated products) and simultaneously analyzing the pooled
products by mass spectrometry. Again, the first extension products of the known nucleic
acid primer sequence are assigned, e.g., via a computer program. Mass/sequence
assignments are possible even in the worst case in which the nucleic acid primer is
extended/terminated by the same nucleotide, e.g., ddT, in all three clones. The following
configurations thus obtained can be well differentiated by their different mass-
modifications: UP⁰-ddT⁰, UP⁰-ddT¹, UP⁰-ddT².
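The "amplification" of the mass increment obtained with mass-modified chain elongators can be made concrete with a small sketch. It assumes, purely for illustration, that reagent species i adds i times a fixed step to every nucleotide it supplies; the function name and parameters are hypothetical.

```python
def fragment_mass_shift(n_elongated, species, step,
                        modified_elongators, modified_terminator):
    """Mass added to a terminated fragment relative to the unmodified clone.

    n_elongated: nucleotides incorporated before the terminator.
    species: index i of the mass-modified reagent set (0 = unmodified).
    step: assumed mass added per modified nucleotide for species 1.
    """
    per_residue = species * step
    shift = 0.0
    if modified_elongators:
        # Every elongating nucleotide in the fragment carries the modification.
        shift += n_elongated * per_residue
    if modified_terminator:
        # The single terminating nucleotide carries the modification.
        shift += per_residue
    return shift
```

With a modified terminator alone the shift is constant for every fragment of a clone, whereas with modified elongators it grows with fragment length, which is the effect the text describes.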

In yet another embodiment of this invention, DNA sequencing by multiplex
mass spectrometry can be achieved by cloning the DNA fragments to be sequenced in
"plex-vectors" containing vector specific "tag sequences" as described (Köster et al.,
"Oligonucleotide Synthesis and Multiplex DNA Sequencing Using Chemiluminescent
Detection," Nucleic Acids Res. Symposium Ser. No. 24, 318-321 (1991)); then pooling
clones from different plex-vectors for DNA preparation and the four separate Sanger
sequencing reactions using standard dNTP⁰/ddNTP⁰ and nucleic acid primer UP⁰;
purifying the four multiplex fragment families via linking to a solid support through the
linking group, L, at the 5'-end of UP; washing out all by-products, and cleaving the
purified multiplex DNA fragments off the support or using the L-L' bound nested Sanger
fragments as such for mass spectrometric analysis as described above; performing de-
multiplexing by one-by-one hybridization of specific "tag probes"; and subsequently
analyzing by mass spectrometry (see, for example, FIGURE 13). As a reference point,
the four base-specifically terminated multiplex DNA fragment families are run by the
mass spectrometer and all ddT⁰-, ddA⁰-, ddC⁰- and ddG⁰-terminated molecular ion
peaks are respectively detected and memorized. Assignment of, for example, ddT⁰-
terminated DNA fragments to a specific fragment family is accomplished by another mass
spectrometric analysis after hybridization of the specific tag probe (TP) to the
corresponding tag sequence contained in the sequence of this specific fragment family.
Only those molecular ion peaks which are capable of hybridizing to the specific tag probe
are shifted to a higher molecular mass by the same known mass increment (e.g., of the tag
probe). These shifted ion peaks, by virtue of all hybridizing to a specific tag probe,
belong to the same fragment family. For a given fragment family, this is repeated for the
remaining chain terminated fragment families with the same tag probe to assign the
complete DNA sequence. This process is repeated i-1 times corresponding to i clones
multiplexed (the i-th clone is identified by default).
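The de-multiplexing step amounts to comparing the memorized reference spectrum with the spectrum recorded after hybridization and keeping the peaks that moved by the tag probe mass. A minimal sketch follows, with hypothetical peak values and a hypothetical probe mass; family_from_tag_shift is a name introduced only for this example.

```python
def family_from_tag_shift(reference_peaks, hybridized_peaks, probe_mass, tol=0.5):
    """Return the reference peaks that reappear shifted by the tag probe mass.

    A reference peak belongs to the probed fragment family when the spectrum
    recorded after hybridization contains a peak at (reference + probe_mass).
    """
    family = []
    for peak in reference_peaks:
        shifted = peak + probe_mass
        if any(abs(observed - shifted) <= tol for observed in hybridized_peaks):
            family.append(peak)
    return family
```

Repeating the comparison with each tag probe in turn assigns every memorized peak to one clone; the peaks left over after i-1 probes belong to the remaining clone.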

The differentiation of the tag probes for the different multiplexed clones can
be obtained just by the DNA sequence and its ability to Watson-Crick base pair to the tag
sequence. It is well known in the art how to calculate stringency conditions to provide
for specific hybridization of a given tag probe with a given tag sequence (see, for
example, Molecular Cloning: A Laboratory Manual, 2nd ed., ed. by Sambrook, Fritsch and
Maniatis (Cold Spring Harbor Laboratory Press: NY, 1989), Chapter 11). Furthermore,
differentiation can be obtained by designing the tag sequence for each plex-vector to have
a sufficient mass difference so as to be unique just by changing the length or base
composition or by mass-modifications according to FIGURES 7, 9 and 10. In order to
keep the duplex between the tag sequence and the tag probe intact during mass
spectrometric analysis, it is another embodiment of the invention to provide for a covalent
attachment mediated by, for example, photoreactive groups such as psoralen and
ellipticine and by other methods known to those skilled in the art (see, for example,
Helene et al., Nature 344, 358 (1990) and Thuong et al., "Oligonucleotides Attached to
Intercalators, Photoreactive and Cleavage Agents" in F. Eckstein, Oligonucleotides and
Analogues: A Practical Approach, IRL Press, Oxford 1991, 283-306).

The DNA sequence is unraveled again by searching for the lowest
molecular weight molecular ion peak corresponding to the known UP⁰-tag sequence/tag
probe molecular weight plus the first extension product, e.g., ddT⁰, then the second, the
third, etc.
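Reading the ladder from the lowest peak upward can be sketched as a simple loop. The residue masses below are rounded illustrative values, the 16-unit difference between a deoxy and a dideoxy residue is an approximation, and both function names are assumptions of this sketch; simulate_pools exists only to produce test data.

```python
# Illustrative, rounded residue masses in Da (assumptions, not values from the text).
DN_RESIDUE = {"A": 313.2, "C": 289.2, "G": 329.2, "T": 304.2}
# A dideoxy terminator lacks one oxygen relative to the deoxy residue.
DDN_RESIDUE = {base: mass - 16.0 for base, mass in DN_RESIDUE.items()}


def simulate_pools(sequence, start_mass):
    """Build the four base-specific peak lists for one fragment family."""
    pools = {base: [] for base in DN_RESIDUE}
    prefix = start_mass
    for base in sequence:
        pools[base].append(prefix + DDN_RESIDUE[base])
        prefix += DN_RESIDUE[base]
    return pools


def read_ladder(pools, start_mass, tol=0.5):
    """Read the sequence by finding, step by step, the next terminated fragment."""
    sequence = []
    prefix = start_mass  # known mass of primer (plus tag probe, if hybridized)
    while True:
        hits = [base for base, peaks in pools.items()
                if any(abs(p - (prefix + DDN_RESIDUE[base])) <= tol for p in peaks)]
        if len(hits) != 1:
            break  # no further fragment, or an ambiguous position
        sequence.append(hits[0])
        prefix += DN_RESIDUE[hits[0]]
    return "".join(sequence)
```

The loop stops at the first position where no pool, or more than one pool, contains the expected peak, so a real implementation would need to report that position rather than silently truncate the read.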

In a combination of the latter approach with the previously described
multiplexing processes, a further increase in multiplexing can be achieved by using, in
addition to the tag probe/tag sequence interaction, mass-modified nucleic acid primers
(FIGURE 7) and/or mass-modified deoxynucleoside, dNTP⁰⁻ⁱ, and/or dideoxynucleoside
triphosphates, ddNTP⁰⁻ⁱ. Those skilled in the art will realize that the tag sequence/tag
probe multiplexing approach is not limited to Sanger DNA sequencing generating nested
DNA fragments with DNA polymerases. The DNA sequence can also be determined by
transcribing the unknown DNA sequence from appropriate promoter-containing vectors
(see above) with various RNA polymerases and mixtures of NTP⁰⁻ⁱ/3'-dNTP⁰⁻ⁱ, thus
generating nested RNA fragments.

In yet another embodiment of this invention, the mass-modifying
functionality can be introduced by a two or multiple step process. In this case, the
nucleic acid primer, the chain-elongating or terminating nucleoside triphosphates and/or
the tag probes are, in a first step, modified by a precursor functionality such as azido,
-N₃, or modified with a functional group in which the R in XR is H (FIGURES 7 and 9), thus
providing temporary functions, e.g., but not limited to -OH, -NH₂, -NHR, -SH, -NCS,
-OCO(CH₂)ᵣCOOH (r = 1-20), -NHCO(CH₂)ᵣCOOH (r = 1-20), -OSO₂OH,
-OCO(CH₂)ᵣI (r = 1-20), -OP(O-Alkyl)N(Alkyl)₂. These less bulky functionalities result
in better substrate properties for the enzymatic DNA or RNA synthesis reactions of the
DNA sequencing process. The appropriate mass-modifying functionality is then
introduced after the generation of the nested base-specifically terminated DNA or RNA
fragments prior to mass spectrometry. Several examples of compounds which can serve
as mass-modifying functionalities are depicted in FIGURES 9 and 10 without limiting the
scope of this invention.

Another aspect of this invention concerns kits for sequencing nucleic acids 
by mass spectrometry which include combinations of the above-described sequencing 
reactants. For instance, in one embodiment, the kit comprises reactants for multiplex 
mass spectrometric sequencing of several different species of nucleic acid. The kit can 
include a solid support having a linking functionality (L') for immobilization of the base-
specifically terminated products; at least one nucleic acid primer having a linking group
(L) for reversibly and temporarily linking the primer and solid support through, for
example, a photocleavable bond; a set of chain-elongating nucleotides (e.g., dATP,
dCTP, dGTP and dTTP, or ATP, CTP, GTP and UTP); a set of chain-terminating
nucleotides (such as 2',3'-dideoxynucleotides for DNA synthesis or 3'-deoxynucleotides
for RNA synthesis); and an appropriate polymerase for synthesizing complementary
nucleotides. Primers and/or terminating nucleotides can be mass-modified so that the 
base-specifically terminated fragments generated from one of the species of nucleic acids 
to be sequenced can be distinguished by mass spectrometry from all of the others. 
Alternative to the use of mass-modified synthesis reactants, a set of tag probes (as 
described above) can be included in the kit. The kit can also include appropriate buffers 
as well as instructions for performing multiplex mass spectrometry to concurrently 
sequence multiple species of nucleic acids. 

In another embodiment, a nucleic acid sequencing kit can comprise a solid 
support as described above, a primer for initiating synthesis of complementary nucleic
acid fragments, a set of chain-elongating nucleotides and an appropriate polymerase. The
mass-modified chain-terminating nucleotides are selected so that the addition of one of 
the chain terminators to a growing complementary nucleic acid can be distinguished by 
mass spectrometry. 

The present invention is further illustrated by the following examples which
should not be construed as limiting in any way. The contents of all cited references
(including literature references, issued patents, published patent applications (including
international patent application Publication Number WO 94/16101, entitled "DNA
Sequencing by Mass Spectrometry" by H. Koester; and international patent application
Publication Number WO 94/21822 entitled "DNA Sequencing by Mass Spectrometry Via
Exonuclease Degradation" by H. Koester), and co-pending patent applications (including
U.S. Patent Application Serial No. 08/406,199, entitled "DNA Diagnostics Based on
Mass Spectrometry" by H. Koester)), as cited throughout this application are hereby
expressly incorporated by reference.

EXAMPLE 1

Immobilization of primer-extension products of Sanger DNA sequencing reaction 
for mass spectrometry analysis via disulfide bonds. 

As a solid support, Sequelon membranes (Millipore Corp., Bedford, MA)
with phenyl isothiocyanate groups are used as a starting material. The membrane disks,
with a diameter of 8 mm, are wetted with a solution of N-methylmorpholine/water/2-
propanol (NMM solution) (2/49/49 v/v/v), the excess liquid is removed with filter paper
and the disks are placed on a piece of plastic film or aluminum foil located on a heating
block set to 55°C. A solution of 1 mM 2-mercaptoethylamine (cysteamine) or 2,2'-dithio-
bis(ethylamine) (cystamine) or S-(2-thiopyridyl)-2-thio-ethylamine (10 µl, 10 nmol) in
NMM is added per disk and heated at 55°C. After 15 min, 10 µl of NMM solution are
added per disk and heated for another 5 min. Excess isothiocyanate groups may be
removed by treatment with 10 µl of a 10 mM solution of glycine in NMM solution. For
cystamine, the disks are treated with 10 µl of a solution of 1 M aqueous dithiothreitol
(DTT)/2-propanol (1:1 v/v) for 15 min at room temperature. Then, the disks are
thoroughly washed in a filtration manifold with 5 aliquots of 1 ml each of the NMM
solution, then with 5 aliquots of 1 ml acetonitrile/water (1/1 v/v) and subsequently dried.
If not used immediately, the disks are stored with free thiol groups in a solution of 1 M
aqueous dithiothreitol/2-propanol (1:1 v/v) and, before use, DTT is removed by three
washings with 1 ml each of the NMM solution. The primer oligonucleotides with 5'-SH
functionality can be prepared by various methods (e.g., B.C.F. Chu et al., Nucleic
Acids Res. 14, 5591-5603 (1986), Sproat et al., Nucleic Acids Res. 15, 4837-48 (1987)
and Oligonucleotides and Analogues: A Practical Approach (F. Eckstein, editor), IRL
Press, Oxford, 1991). Sequencing reactions according to the Sanger protocol are
performed in a standard way (e.g., H. Swerdlow et al., Nucleic Acids Res. 18, 1415-19
(1990)). In the presence of about 7-10 mM DTT the free 5'-thiol primer can be used; in
other cases, the SH functionality can be protected, e.g., by a trityl group during the
Sanger sequencing reactions and removed prior to anchoring to the support in the
following way. The four sequencing reactions (150 µl each in an Eppendorf tube) are
terminated by a 10 min incubation at 70°C to denature the DNA polymerase (such as
Klenow fragment, Sequenase) and the reaction mixtures are ethanol precipitated. The
supernatants are removed and the pellets vortexed with 25 µl of a 1 M aqueous silver
nitrate solution, and after one hour at room temperature, 50 µl of a 1 M aqueous
solution of DTT is added and mixed by vortexing. After 15 min, the mixtures are
centrifuged and the pellets are washed twice with 100 µl ethylacetate by vortexing and
centrifugation to remove excess DTT. The primer extension products with free 5'-thiol
group are now coupled to the thiolated membrane supports under mild oxidizing
conditions. In general, it is sufficient to add the 5'-thiolated primer extension products
dissolved in 10 µl 10 mM de-aerated triethylammonium acetate buffer (TEAA) pH 7.2 to
the thiolated membrane supports. Coupling is achieved by drying the samples onto the
membrane disks with a cold fan. This process can be repeated by wetting the membrane
with 10 µl of 10 mM TEAA buffer pH 7.2 and drying as before. When using the 2-
thiopyridyl derivatized compounds, anchoring can be monitored by the release of
pyridine-2-thione spectrophotometrically at 343 nm.
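The reagent amounts quoted in this example follow directly from volume and concentration, as the parenthetical "10 µl, 10 nmol" for the 1 mM thiol solution indicates. The helper below is a minimal sketch for checking such figures; its name and signature are assumptions made for illustration.

```python
def amount_nmol(volume_ul: float, concentration_mM: float) -> float:
    """Amount of solute in nmol: 1 µl of a 1 mM solution contains 1 nmol."""
    return volume_ul * concentration_mM


def molar_excess(reagent_nmol: float, substrate_nmol: float) -> float:
    """Ratio of reagent to substrate, e.g. capping reagent over immobilized thiol."""
    if substrate_nmol <= 0:
        raise ValueError("substrate amount must be positive")
    return reagent_nmol / substrate_nmol
```

For instance, the 10 µl of 10 mM glycine used for capping corresponds to 100 nmol, a tenfold excess over the 10 nmol of thiol compound applied per disk.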

In another variation of this approach, the oligonucleotide primer is
functionalized with an amino group at the 5'-end which is introduced by standard
procedures during automated DNA synthesis. After primer extension, during the Sanger
sequencing process, the primary amino group is reacted with 3-(2-pyridyldithio)
propionic acid N-hydroxysuccinimide ester (SPDP) and subsequently coupled to the
thiolated supports and monitored by the release of pyridyl-2-thione as described above.
After denaturation of DNA polymerase and ethanol precipitation of the sequencing
products, the supernatants are removed and the pellets dissolved in 10 µl 10 mM TEAA
buffer pH 7.2 and 10 µl of a 2 mM solution of SPDP in 10 mM TEAA are added. The
reaction mixture is vortexed and incubated for 30 min at 25°C. Excess SPDP is then
removed by three extractions (vortexing, centrifugation) with 50 µl each of ethanol and
the resulting pellets are dissolved in 10 µl 10 mM TEAA buffer pH 7.2 and coupled to
the thiolated supports (see above).

The primer-extension products are purified by washing the membrane disks
three times each with 100 µl NMM solution and three times with 100 µl each of 10 mM
TEAA buffer pH 7.2. The purified primer-extension products are released by three
successive treatments with 10 µl of 10 mM 2-mercaptoethanol in 10 mM TEAA buffer
pH 7.2, lyophilized and analyzed by either ES or MALDI mass spectrometry.

This procedure can also be used for the mass-modified nucleic acid primers
UP⁰⁻ⁱ in an analogous and appropriate way, taking into account the chemical properties
of the mass-modifying functionalities.

EXAMPLE 2

Immobilization of primer-extension products of Sanger DNA sequencing reaction
for mass spectrometric analysis via the levulinyl group

5-Aminolevulinic acid is protected at the primary amino group with the
Fmoc group using 9-fluorenylmethyl N-succinimidyl carbonate and is then transformed
into the N-hydroxysuccinimide ester (NHS ester) using N-hydroxysuccinimide and
dicyclohexyl carbodiimide under standard conditions. For the Sanger sequencing
reactions, nucleic acid primers, UP⁰⁻ⁱ, are used which are functionalized with a primary
amino group at the 5'-end introduced by standard procedures during automated DNA
synthesis with aminolinker phosphoamidites as the final synthetic step. Sanger
sequencing is performed under standard conditions (see above). The four reaction
mixtures (150 µl each in an Eppendorf tube) are heated to 70°C for 10 min to inactivate
the DNA polymerase, ethanol precipitated, centrifuged and resuspended in 10 µl of 10
mM TEAA buffer pH 7.2. 10 µl of a 2 mM solution of the Fmoc-5-aminolevulinyl-NHS
ester in 10 mM TEAA buffer is added, vortexed and incubated at 25°C for 30 min. The
excess of the reagent is removed by ethanol precipitation and centrifugation. The Fmoc
group is cleaved off by resuspending the pellets in 10 µl of a solution of 20% piperidine in
N,N-dimethylformamide/water (1:1 v/v). After 15 min at 25°C, piperidine is thoroughly
removed by three precipitations/centrifugations with 100 µl each of ethanol, the pellets
are resuspended in 10 µl of a solution of N-methylmorpholine, 2-propanol and water
(2/10/88 v/v/v) and are coupled to the solid support carrying an isothiocyanate group. In
the case of the DITC-Sequelon membrane (Millipore Corp., Bedford, MA), the
membranes are prepared as described in EXAMPLE 1 and coupling is achieved on a
heating block at 55°C as described above. RNA extension products are immobilized in
an analogous way. The procedure can be applied to other solid supports with
isothiocyanate groups in a similar manner.

The immobilized primer-extension products are extensively washed three
times with 100 µl each of NMM solution and three times with 100 µl 10 mM TEAA
buffer pH 7.2. The purified primer-extension products are released by three successive
treatments with 10 µl of 100 mM hydrazinium acetate buffer pH 6.5, lyophilized and
analyzed by either ES or MALDI mass spectrometry.



EXAMPLE 3

Immobilization of primer-extension products of Sanger DNA sequencing reaction
for mass spectrometric analysis via a trypsin sensitive linkage

Sequelon DITC membrane disks of 8 mm diameter (Millipore Corp.,
Bedford, MA) are wetted with 10 µl of NMM solution (N-methylmorpholine/2-
propanol/water; 2/49/49 v/v/v) and a linker arm introduced by reaction with 10 µl of a 10 mM
solution of 1,6-diaminohexane in NMM. The excess diamine is removed by three
washing steps with 100 µl of NMM solution. Using standard peptide synthesis protocols,
two L-lysine residues are attached by two successive condensations with N-Fmoc-N-
tBoc-L-lysine pentafluorophenylester, the terminal Fmoc group is removed with
piperidine in NMM and the free ε-amino group coupled to 1,4-phenylene
diisothiocyanate (DITC). Excess DITC is removed by three washing steps with 100 µl 2-
propanol and the N-tBoc groups removed with trifluoroacetic acid according to standard
peptide synthesis procedures. The nucleic acid primer-extension products are prepared
from oligonucleotides which carry a primary amino group at the 5'-terminus. The four
Sanger DNA sequencing reaction mixtures (150 µl each in Eppendorf tubes) are heated
for 10 min at 70°C to inactivate the DNA polymerase, ethanol precipitated, and the
pellets resuspended in 10 µl of a solution of N-methylmorpholine, 2-propanol and water
(2/10/88 v/v/v). This solution is transferred to the Lys-Lys-DITC membrane disks and
coupled on a heating block set at 55°C. After drying, 10 µl of NMM solution is added
and the drying process repeated.

The immobilized primer-extension products are extensively washed three
times with 100 µl each of NMM solution and three times with 100 µl each of 10 mM
TEAA buffer pH 7.2. For mass spectrometric analysis, the bond between the primer-
extension products and the solid support is cleaved by treatment with trypsin under
standard conditions and the released products analyzed by either ES or MALDI mass
spectrometry with trypsin serving as an internal mass standard.

EXAMPLE 4 

Immobilization of primer-extension products of Sanger DNA sequencing reaction 
for mass spectrometric analysis via pyrophosphate linkage 

The DITC Sequelon membranes (disks of 8 mm diameter) are prepared as
described in EXAMPLE 3 and 10 µl of a 10 mM solution of 3-aminopyridine adenine
dinucleotide (APAD) (Sigma) in NMM solution is added. The excess APAD is removed by
a 10 µl wash of NMM solution and the disks are treated with 10 µl of 10 mM sodium
periodate in NMM solution (15 min, 25°C). Excess periodate is removed and the
primer-extension products of the four Sanger DNA sequencing reactions (150 µl each in
Eppendorf tubes) employing nucleic acid primers with a primary amino group at the 5'-
end are ethanol precipitated, dissolved in 10 µl of a solution of N-methylmorpholine/2-
propanol/water (2/10/88 v/v/v) and coupled to the 2',3'-dialdehydo groups of the
immobilized NAD analog.

The primer-extension products are extensively washed with the NMM
solution (3 times with 100 µl each) and 10 mM TEAA buffer pH 7.2 (3 times with 100 µl
each) and the purified primer-extension products are released by treatment with either
NADase or pyrophosphatase in 10 mM TEAA buffer at pH 7.2 at 37°C for 15 min,
lyophilized and analyzed by either ES or MALDI mass spectrometry, the enzymes serving
as internal mass standards.



EXAMPLE 5

Synthesis of nucleic acid primers mass-modified by glycine residues at the 5'-
position of the sugar moiety of the terminal nucleoside

Oligonucleotides are synthesized by standard automated DNA synthesis
using β-cyanoethylphosphoamidites (H. Köster et al., Nucleic Acids Res. 12, 4539
(1984)) and a 5'-amino group is introduced at the end of solid phase DNA synthesis (e.g.,
Agrawal et al., Nucleic Acids Res. 14, 6227-45 (1986) or Sproat et al., Nucleic Acids
Res. 15, 6181-96 (1987)). The total amount of an oligonucleotide synthesis, starting
with 0.25 µmol CPG-bound nucleoside, is deprotected with concentrated aqueous
ammonia, purified via OligoPAK™ Cartridges (Millipore Corp., Bedford, MA) and
lyophilized. This material with a 5'-terminal amino group is dissolved in 100 µl absolute
N,N-dimethylformamide (DMF) and condensed with 10 µmole N-Fmoc-glycine
pentafluorophenyl ester for 60 min at 25°C. After ethanol precipitation and
centrifugation, the Fmoc group is cleaved off by a 10 min treatment with 100 µl of a
solution of 20% piperidine in N,N-dimethylformamide. Excess piperidine, DMF and the
cleavage product from the Fmoc group are removed by ethanol precipitation and the
precipitate lyophilized from 10 mM TEAA buffer pH 7.2. This material is now either
used as primer for the Sanger DNA sequencing reactions or one or more glycine residues
(or other suitable protected amino acid active esters) are added to create a series of mass-
modified primer oligonucleotides suitable for Sanger DNA or RNA sequencing.
Immobilization of these mass-modified nucleic acid primers UP⁰⁻ⁱ after primer-extension
during the sequencing process can be achieved as described, e.g., in EXAMPLES 1 to 4.

EXAMPLE 6

Synthesis of nucleic acid primers mass-modified at C-5 of the heterocyclic base of a
pyrimidine nucleoside with glycine residues

Starting material was 5-(3-aminopropynyl-1)-3',5'-di-p-tolyldeoxyuridine
prepared and 3',5'-de-O-acylated according to literature procedures (Haralambidis et al.,
Nucleic Acids Res. 15, 4857-76 (1987)). 0.281 g (1.0 mmole) 5-(3-aminopropynyl-1)-2'-
deoxyuridine were reacted with 0.927 g (2.0 mmole) N-Fmoc-glycine
pentafluorophenylester in 5 ml absolute N,N-dimethylformamide in the presence of 0.129
g (1 mmole; 174 µl) N,N-diisopropylethylamine for 60 min at room temperature.
Solvents were removed by rotary evaporation and the product was purified by silica gel
chromatography (Kieselgel 60, Merck; column: 2.5 x 50 cm, elution with
chloroform/methanol mixtures). Yield was 0.44 g (0.78 mmole, 78 %). In order to add
another glycine residue, the Fmoc group is removed with a 20 min treatment with a 20%
solution of piperidine in DMF, evaporated in vacuo and the remaining solid material
extracted three times with 20 ml ethylacetate. After having removed the remaining
ethylacetate, N-Fmoc-glycine pentafluorophenylester is coupled as described above.
5-(3-(N-Fmoc-glycyl)-amidopropynyl-1)-2'-deoxyuridine is transformed into the 5'-O-
dimethoxytritylated nucleoside-3'-O-β-cyanoethyl-N,N-diisopropylphosphoamidite and
incorporated into automated oligonucleotide synthesis by standard procedures (H. Koster
et al., Nucleic Acids Res. 12, 2261 (1984)). This glycine modified thymidine analogue
building block for chemical DNA synthesis can be used to substitute one or more of the
thymidine/uridine nucleotides in the nucleic acid primer sequence. The Fmoc group is
removed at the end of the solid phase synthesis with a 20 min treatment with a 20 %
solution of piperidine in DMF at room temperature. DMF is removed by a washing step
with acetonitrile and the oligonucleotide deprotected and purified in the standard way.

EXAMPLE 7

Synthesis of a nucleic acid primer mass-modified at C-5 of the heterocyclic base of
a pyrimidine nucleoside with β-alanine residues

Starting material was the same as in EXAMPLE 6. 0.281 g (1.0 mmole)
5-(3-Aminopropynyl-1)-2'-deoxyuridine was reacted with N-Fmoc-β-alanine
pentafluorophenylester (0.955 g, 2.0 mmole) in 5 ml N,N-dimethylformamide (DMF) in
the presence of 0.129 g (174 µl; 1.0 mmole) N,N-diisopropylethylamine for 60 min at
room temperature. Solvents were removed and the product purified by silica gel
chromatography as described in EXAMPLE 6. Yield was 0.425 g (0.74 mmole, 74 %).
Another β-alanine moiety can be added in exactly the same way after removal of the
Fmoc group. The preparation of the 5'-O-dimethoxytritylated nucleoside-3'-O-β-
cyanoethyl-N,N-diisopropylphosphoamidite from 5-(3-(N-Fmoc-β-alanyl)-
amidopropynyl-1)-2'-deoxyuridine and incorporation into automated oligonucleotide
synthesis is performed under standard conditions. This building block can substitute for
any of the thymidine/uridine residues in the nucleic acid primer sequence. In the case of
only one incorporated mass-modified nucleotide, the nucleic acid primer molecules
prepared according to EXAMPLES 6 and 7 would have a mass difference of 14 daltons.

EXAMPLE 8

Synthesis of a nucleic acid primer mass-modified at C-5 of the heterocyclic base of 
a pyrimidine nucleoside with ethylene glycol monomethyl ether 

As a nucleosidic component, 5-(3-aminopropynyl-1)-2'-deoxyuridine was
used in this example (see EXAMPLES 6 and 7). The mass-modifying functionality was
obtained as follows: 7.61 g (100.0 mmole) freshly distilled ethylene glycol monomethyl
ether dissolved in 50 ml absolute pyridine was reacted with 10.01 g (100.0 mmole)
recrystallized succinic anhydride in the presence of 1.22 g (10.0 mmole) 4-N,N-
dimethylaminopyridine overnight at room temperature. The reaction was terminated by
the addition of water (5.0 ml), the reaction mixture evaporated in vacuo, co-evaporated
twice with dry toluene (20 ml each) and the residue redissolved in 100 ml
dichloromethane. The solution was extracted successively, twice with 10 % aqueous
citric acid (2 x 20 ml) and once with water (20 ml) and the organic phase dried over
anhydrous sodium sulfate. The organic phase was evaporated in vacuo, the residue
redissolved in 50 ml dichloromethane and precipitated into 500 ml pentane and the
precipitate dried in vacuo. Yield was 13.12 g (74.0 mmole; 74 %). 8.86 g (50.0 mmole)
of succinylated ethylene glycol monomethyl ether was dissolved in 100 ml dioxane
containing 5% dry pyridine (5 ml) and 6.96 g (50.0 mmole) 4-nitrophenol and 10.32 g
(50.0 mmole) dicyclohexylcarbodiimide was added and the reaction run at room
temperature for 4 hours. Dicyclohexylurea was removed by filtration, the filtrate
evaporated in vacuo and the residue redissolved in 50 ml anhydrous DMF. 12.5 ml
(about 12.5 mmole 4-nitrophenylester) of this solution was used to dissolve 2.81 g (10.0
mmole) 5-(3-aminopropynyl-1)-2'-deoxyuridine. The reaction was performed in the
presence of 1.01 g (10.0 mmole; 1.4 ml) triethylamine at room temperature overnight.
The reaction mixture was evaporated in vacuo, co-evaporated with toluene, redissolved
in dichloromethane and chromatographed on silica gel (Si60, Merck; column 4 x 50 cm)
with dichloromethane/methanol mixtures. The fractions containing the desired compound
were collected, evaporated, redissolved in 25 ml dichloromethane and precipitated into
250 ml pentane. The dried precipitate of 5-(3-N-(O-succinyl ethylene glycol monomethyl
ether)-amidopropynyl-1)-2'-deoxyuridine (yield: 65 %) is 5'-O-dimethoxytritylated and
transformed into the nucleoside-3'-O-β-cyanoethyl-N,N-diisopropylphosphoamidite and
incorporated as a building block in the automated oligonucleotide synthesis according to
standard procedures. The mass-modified nucleotide can substitute for one or more of the
thymidine/uridine residues in the nucleic acid primer sequence. Deprotection and
purification of the primer oligonucleotide also follow standard procedures.

EXAMPLE 9

Synthesis of a nucleic acid primer mass-modified at C-5 of the heterocyclic base of 
a pyrimidine nucleoside with diethylene glycol monomethyl ether 

Nucleosidic starting material was as in previous examples, 5-(3-
aminopropynyl-1)-2'-deoxyuridine. The mass-modifying functionality was obtained
similar to EXAMPLE 8. 12.02 g (100.0 mmole) freshly distilled diethylene glycol
monomethyl ether dissolved in 50 ml absolute pyridine was reacted with 10.01 g (100.0
mmole) recrystallized succinic anhydride in the presence of 1.22 g (10.0 mmole) 4-N,N-
dimethylaminopyridine (DMAP) overnight at room temperature. The work-up was as
described in EXAMPLE 8. Yield was 18.35 g (82.3 mmole, 82.3 %). 11.06 g (50.0
mmole) of succinylated diethylene glycol monomethyl ether was transformed into the 4-
nitrophenylester and, subsequently, 12.5 mmole was reacted with 2.81 g (10.0 mmole) of
5-(3-aminopropynyl-1)-2'-deoxyuridine as described in EXAMPLE 8. Yield after silica
gel column chromatography and precipitation into pentane was 3.34 g (6.9 mmole,
69 %). After dimethoxytritylation and transformation into the nucleoside-β-
cyanoethylphosphoamidite, the mass-modified building block is incorporated into
automated chemical DNA synthesis according to standard procedures. Within the
sequence of the nucleic acid primer UP^(0-i), one or more of the thymidine/uridine
residues can be substituted by this mass-modified nucleotide. In the case of only one
incorporated mass-modified nucleotide, the nucleic acid primers of EXAMPLES 8 and 9
would have a mass difference of 44.05 daltons.

EXAMPLE 10 

Synthesis of a nucleic acid primer mass-modified at C-8 of the heterocyclic base of 
deoxyadenosine with glycine 

Starting material was N6-benzoyl-8-bromo-5'-O-(4,4'-dimethoxytrityl)-2'-
deoxyadenosine prepared according to literature (Singh et al., Nucleic Acids Res. 18,
3339-45 (1990)). 632.5 mg (1.0 mmole) of this 8-bromo-deoxyadenosine derivative was
suspended in 5 ml absolute ethanol and reacted with 251.2 mg (2.0 mmole) glycine
methyl ester (hydrochloride) in the presence of 241.4 mg (2.1 mmole; 366 µl) N,N-
diisopropylethylamine and refluxed until the starting nucleoside material had disappeared
(4-6 hours) as checked by thin layer chromatography (TLC). The solvent was
evaporated and the residue purified by silica gel chromatography (column 2.5 x 50 cm)
using solvent mixtures of chloroform/methanol containing 0.1 % pyridine. The product
fractions were combined, the solvent evaporated, the fractions dissolved in 5 ml
dichloromethane and precipitated into 100 ml pentane. Yield was 487 mg (0.76 mmole,
76 %). Transformation into the corresponding nucleoside-β-cyanoethylphosphoamidite
and integration into automated chemical DNA synthesis is performed under standard
conditions. During final deprotection with aqueous concentrated ammonia, the methyl
group is removed from the glycine moiety. The mass-modified building block can
substitute one or more deoxyadenosine/adenosine residues in the nucleic acid primer
sequence.

EXAMPLE 11

Synthesis of a nucleic acid primer mass-modified at C-8 of the heterocyclic base of
deoxyadenosine with glycylglycine

This derivative was prepared in analogy to the glycine derivative of
EXAMPLE 10. 632.5 mg (1.0 mmole) N6-Benzoyl-8-bromo-5'-O-(4,4'-
dimethoxytrityl)-2'-deoxyadenosine was suspended in 5 ml absolute ethanol and reacted
with 324.3 mg (2.0 mmole) glycyl-glycine methyl ester in the presence of 241.4 mg (2.1
mmole, 366 µl) N,N-diisopropylethylamine. The mixture was refluxed and completeness
of the reaction checked by TLC. Work-up and purification was similar to that described
in EXAMPLE 10. Yield after silica gel column chromatography and precipitation into
pentane was 464 mg (0.65 mmole, 65 %). Transformation into the nucleoside-β-
cyanoethylphosphoamidite and into synthetic oligonucleotides is done according to
standard procedures. In the case where only one of the deoxyadenosine/adenosine
residues in the nucleic acid primer is substituted by this mass-modified nucleotide, the
mass difference between the nucleic acid primers of EXAMPLES 10 and 11 is 57.03 daltons.

EXAMPLE 12 

Synthesis of a nucleic acid primer mass-modified at the C-2' of the sugar moiety of
2'-amino-2'-deoxythymidine with ethylene glycol monomethyl ether residues

Starting material was 5'-O-(4,4-dimethoxytrityl)-2'-amino-2'-
deoxythymidine synthesized according to published procedures (e.g., Verheyden et al.,
J. Org. Chem. 36, 250-254 (1971); Sasaki et al., J. Org. Chem. 41, 3138-3143 (1976);
Imazawa et al., J. Org. Chem. 44, 2039-2041 (1979); Hobbs et al., J. Org. Chem. 42,
714-719 (1976); Ikehara et al., Chem. Pharm. Bull. Japan 26, 240-244 (1978); see also
PCT Application WO 88/00201). 5'-O-(4,4-Dimethoxytrityl)-2'-amino-2'-
deoxythymidine (559.62 mg; 1.0 mmole) was reacted with 2.0 mmole of the 4-
nitrophenyl ester of succinylated ethylene glycol monomethyl ether (see EXAMPLE 8) in
10 ml dry DMF in the presence of 1.0 mmole (140 µl) triethylamine for 18 hours at room
temperature. The reaction mixture was evaporated in vacuo, co-evaporated with
toluene, redissolved in dichloromethane and purified by silica gel chromatography (Si60,
Merck; column: 2.5 x 50 cm; eluent: chloroform/methanol mixtures containing 0.1 %
triethylamine). The product containing fractions were combined, evaporated and
precipitated into pentane. Yield was 524 mg (0.73 mmol; 73 %). Transformation into
the nucleoside-β-cyanoethyl-N,N-diisopropylphosphoamidite and incorporation into the
automated chemical DNA synthesis protocol is performed by standard procedures. The
mass-modified deoxythymidine derivative can substitute for one or more of the thymidine
residues in the nucleic acid primer.

In an analogous way, by employing the 4-nitrophenyl ester of succinylated
diethylene glycol monomethyl ether (see EXAMPLE 9) and triethylene glycol
monomethyl ether, the corresponding mass-modified oligonucleotides are prepared. In
the case of only one incorporated mass-modified nucleoside within the sequence, the
mass difference between the ethylene, diethylene and triethylene glycol derivatives is
44.05, 88.1 and 132.15 daltons respectively.
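
The increments quoted for the glycol series correspond to multiples of one oxyethylene unit (-CH2-CH2-O-). The following is a minimal sketch of that arithmetic, not part of the procedure itself; the atomic mass table (rounded standard average values) and the helper name `formula_mass` are assumptions introduced for illustration.

```python
# Average atomic masses in daltons (assumed, rounded standard values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Composition of one oxyethylene repeat unit, -CH2-CH2-O-.
OXYETHYLENE = {"C": 2, "H": 4, "O": 1}


def formula_mass(formula):
    """Sum the average atomic masses for a composition {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


unit = formula_mass(OXYETHYLENE)

# Mass contributed by one, two and three oxyethylene units.
for n, name in enumerate(["ethylene", "diethylene", "triethylene"], start=1):
    print(f"{name} glycol: {n * unit:.2f} Da")
```

The computed multiples agree with the quoted 44.05, 88.1 and 132.15 daltons to within rounding of the unit mass.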

EXAMPLE 13

Synthesis of a nucleic acid primer mass-modified in the internucleotidic linkage via
alkylation of phosphorothioate groups

Phosphorothioate-containing oligonucleotides were prepared according to
standard procedures (see e.g. Gait et al., Nucleic Acids Res. 19, 1183 (1991)). One,
several or all internucleotide linkages can be modified in this way. The (-)-M13 nucleic
acid primer sequence (17-mer) 5'-dGTAAAACGACGGCCAGT was synthesized in 0.25
µmole scale on a DNA synthesizer and one phosphorothioate group introduced after the
final synthesis cycle (G to T coupling). Sulfurization, deprotection and purification
followed standard protocols. Yield was 31.4 nmole (12.6 % overall yield),
corresponding to 31.4 nmole phosphorothioate groups. Alkylation was performed by
dissolving the residue in 31.4 µl TE buffer (0.01 M Tris pH 8.0, 0.001 M EDTA) and by
adding 16 µl of a 20 mM solution of 2-iodoethanol (320 nmole; i.e., 10-fold
excess with respect to phosphorothioate diesters) in N,N-dimethylformamide (DMF).
The alkylated oligonucleotide was purified by standard reversed phase HPLC (RP-18
Ultrasphere, Beckman; column: 4.5 x 250 mm; 100 mM triethylammonium acetate, pH 7.0
and a gradient of 5 to 40 % acetonitrile).
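
The yield and reagent-excess figures in this paragraph can be checked with simple arithmetic. The sketch below assumes the synthesis scale is 0.25 µmole (250 nmole), which is the reading consistent with the stated 12.6 % overall yield; the function names are hypothetical helpers, not part of the protocol.

```python
def percent_yield(obtained_nmol, scale_nmol):
    """Overall yield in percent relative to the synthesis scale."""
    return 100.0 * obtained_nmol / scale_nmol


def reagent_nmol(volume_ul, conc_mM):
    """Amount of reagent in nmol; 1 mM equals 1 nmol per microliter."""
    return volume_ul * conc_mM


oligo_nmol = 31.4                              # isolated phosphorothioate oligonucleotide
yield_pct = percent_yield(oligo_nmol, 250.0)   # 0.25 umole scale, assumed
iodoethanol_nmol = reagent_nmol(16, 20)        # 16 ul of a 20 mM solution
excess = iodoethanol_nmol / oligo_nmol         # molar excess over phosphorothioate groups

print(f"yield: {yield_pct:.1f} %")
print(f"2-iodoethanol: {iodoethanol_nmol} nmol, {excess:.1f}-fold excess")
```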

In a variation of this procedure, the nucleic acid primer containing one or
more phosphorothioate phosphodiester bonds is used in the Sanger sequencing reactions.
The primer-extension products of the four sequencing reactions are purified as
exemplified in EXAMPLES 1 - 4, cleaved off the solid support, lyophilized and dissolved
in 4 µl each of TE buffer pH 8.0 and alkylated by addition of 2 µl of a 20 mM solution of
2-iodoethanol in DMF. They are then analyzed by ES and/or MALDI mass spectrometry.

In an analogous way, employing, instead of 2-iodoethanol, e.g., 3-
iodopropanol or 4-iodobutanol, mass-modified nucleic acid primers are obtained with a
mass difference of 14.03, 28.06 and 42.03 daltons respectively compared to the unmodified
phosphorothioate phosphodiester-containing oligonucleotide.



EXAMPLE 14 

Synthesis of 2'-amino-2'-deoxyuridine-5'-triphosphate and 3'-amino-2',3'-
dideoxythymidine-5'-triphosphate mass-modified at the 2'- or 3'-amino function
with glycine or β-alanine residues

Starting material was 2'-azido-2'-deoxyuridine prepared according to
literature (Verheyden et al., J. Org. Chem. 36, 250 (1971)), which was 4,4-
dimethoxytritylated at 5'-OH with 4,4-dimethoxytrityl chloride in pyridine and acetylated
at 3'-OH with acetic anhydride in a one-pot reaction using standard reaction conditions.
With 191 mg (0.71 mmole) 2'-azido-2'-deoxyuridine as starting material, 396 mg (0.65
mmol, 90.8 %) 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-azido-2'-deoxyuridine was
obtained after purification via silica gel chromatography. Reduction of the azido group
was performed using published conditions (Barta et al., Tetrahedron 46, 587-594
(1990)). Yield of 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-amino-2'-deoxyuridine after
silica gel chromatography was 288 mg (0.49 mmole; 76 %). This protected 2'-amino-2'-
deoxyuridine derivative (588 mg, 1.0 mmole) was reacted with 2 equivalents (927 mg,
2.0 mmole) N-Fmoc-glycine pentafluorophenyl ester in 10 ml dry DMF overnight at
room temperature in the presence of 1.0 mmole (174 µl) N,N-diisopropylethylamine.
Solvents were removed by evaporation in vacuo and the residue purified by silica gel
chromatography. Yield was 711 mg (0.71 mmole, 82 %). Detritylation was achieved by
a one hour treatment with 80% aqueous acetic acid at room temperature. The residue
was evaporated to dryness, co-evaporated twice with toluene, suspended in 1 ml dry
acetonitrile and 5'-phosphorylated with POCl3 according to literature (Yoshikawa et al.,
Bull. Chem. Soc. Japan 42, 3505 (1969) and Sowa et al., Bull. Chem. Soc. Japan 48,
2084 (1975)) and directly transformed in a one-pot reaction to the 5'-triphosphate using 3
ml of a 0.5 M solution (1.5 mmole) tetra(tri-n-butylammonium) pyrophosphate in DMF
according to literature (e.g. Seela et al., Helvetica Chimica Acta 74, 1048 (1991)). The
Fmoc and the 3'-O-acetyl groups were removed by a one-hour treatment with
concentrated aqueous ammonia at room temperature and the reaction mixture evaporated
and lyophilized. Purification also followed standard procedures by using anion-exchange
chromatography on DEAE-Sephadex with a linear gradient of triethylammonium
bicarbonate (0.1 M - 1.0 M). Triphosphate containing fractions (checked by thin layer
chromatography on polyethyleneimine cellulose plates) were collected, evaporated and
lyophilized. Yield (by UV-absorbance of the uracil moiety) was 68% (0.48 mmole).

A glycyl-glycine modified 2'-amino-2'-deoxyuridine-5'-triphosphate was
obtained by removing the Fmoc group from 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-N-
(N-9-fluorenylmethyloxycarbonyl-glycyl)-2'-amino-2'-deoxyuridine by a one-hour
treatment with a 20% solution of piperidine in DMF at room temperature, evaporation of
solvents, two-fold co-evaporation with toluene and subsequent condensation with N-
Fmoc-glycine pentafluorophenyl ester. Starting with 1.0 mmole of the 2'-N-glycyl-2'-
amino-2'-deoxyuridine derivative and following the procedure described above, 0.72
mmole (72%) of the corresponding 2'-(N-glycyl-glycyl)-2'-amino-2'-deoxyuridine-5'-
triphosphate was obtained.

Starting with 5'-O-(4,4-dimethoxytrityl)-3'-O-acetyl-2'-amino-2'-
deoxyuridine and coupling with N-Fmoc-β-alanine pentafluorophenyl ester, the
corresponding 2'-(N-β-alanyl)-2'-amino-2'-deoxyuridine-5'-triphosphate can be
synthesized. These modified nucleoside triphosphates are incorporated during the Sanger
DNA sequencing process in the primer-extension products. The mass difference between
the glycine, β-alanine and glycyl-glycine mass-modified nucleosides is, per nucleotide
incorporated, 58.06, 72.09 and 115.1 daltons respectively.
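
As a check on these figures, the quoted values match the average masses of the glycyl, β-alanyl and glycyl-glycyl acyl groups. The sketch below is illustrative only: the atomic masses are rounded standard values, the compositions are written out by hand, and the helper name is hypothetical.

```python
# Average atomic masses in daltons (assumed, rounded standard values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Elemental compositions of the acyl groups attached to the amino function.
ACYL_GROUPS = {
    "glycyl": {"C": 2, "H": 4, "N": 1, "O": 1},         # H2N-CH2-CO-
    "beta-alanyl": {"C": 3, "H": 6, "N": 1, "O": 1},    # H2N-CH2-CH2-CO-
    "glycyl-glycyl": {"C": 4, "H": 7, "N": 2, "O": 2},  # H2N-CH2-CO-NH-CH2-CO-
}


def formula_mass(formula):
    """Sum the average atomic masses for a composition {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


masses = {name: formula_mass(f) for name, f in ACYL_GROUPS.items()}
for name, mass in masses.items():
    print(f"{name}: {mass:.2f} Da")
```

The spacing between the glycyl and β-alanyl groups is one methylene unit (about 14 Da), which is what allows the corresponding fragment families to be distinguished in a multiplexed spectrum.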



When starting with 5'-O-(4,4-dimethoxytrityl)-3'-amino-2',3'-
dideoxythymidine (obtained by published procedures, see EXAMPLE 12), the
corresponding 3'-(N-glycyl)-3'-amino-, 3'-(N-glycyl-glycyl)-3'-amino- and 3'-(N-β-
alanyl)-3'-amino-2',3'-dideoxythymidine-5'-triphosphates can be obtained. These mass-
modified nucleoside triphosphates serve as a terminating nucleotide unit in the Sanger
DNA sequencing reactions providing a mass difference per terminated fragment of 58.06,
72.09 and 115.1 daltons respectively when used in the multiplexing sequencing mode.
The mass-differentiated fragments can then be analyzed by ES and/or MALDI mass
spectrometry.



EXAMPLE 15

Synthesis of deoxyuridine-5'-triphosphate mass-modified at C-5 of the heterocyclic
base with glycine, glycyl-glycine and β-alanine residues



0.281 g (1.0 mmole) 5-(3-Aminopropynyl-1)-2'-deoxyuridine (see
EXAMPLE 6) was reacted with either 0.927 g (2.0 mmole) N-Fmoc-glycine
pentafluorophenylester or 0.955 g (2.0 mmole) N-Fmoc-β-alanine pentafluorophenyl ester
in 5 ml dry DMF in the presence of 0.129 g N,N-diisopropylethylamine (174 µl, 1.0
mmole) overnight at room temperature. Solvents were removed by evaporation in vacuo
and the condensation products purified by flash chromatography on silica gel (Still et al.,
J. Org. Chem. 43, 2923-2925 (1978)). Yields were 476 mg (0.85 mmole; 85%) for the
glycine and 436 mg (0.76 mmole; 76%) for the β-alanine derivatives. For the synthesis of
the glycyl-glycine derivative, the Fmoc group of 1.0 mmole Fmoc-glycine-deoxyuridine
derivative was removed by one-hour treatment with 20% piperidine in DMF at room
temperature. Solvents were removed by evaporation in vacuo, the residue was co-
evaporated twice with toluene and condensed with 0.927 g (2.0 mmole) N-Fmoc-glycine
pentafluorophenyl ester and purified as described above. Yield was 445 mg (0.72 mmole;
72%). The glycyl-, glycyl-glycyl- and β-alanyl-2'-deoxyuridine derivatives, N-protected
with the Fmoc group, were transformed to the 3'-O-acetyl derivatives by tritylation with
4,4-dimethoxytrityl chloride in pyridine and acetylation with acetic anhydride in pyridine
in a one-pot reaction and subsequently detritylated by one hour treatment with 80%
aqueous acetic acid according to standard procedures. Solvents were removed, the
residues dissolved in 100 ml chloroform and extracted twice with 50 ml 10% sodium
bicarbonate and once with 50 ml water, dried with sodium sulfate, the solvent evaporated
and the residues purified by flash chromatography on silica gel. Yields were 361 mg
(0.60 mmole; 71%) for the glycyl-, 351 mg (0.57 mmole; 75%) for the β-alanyl- and 323
mg (0.49 mmole; 68%) for the glycyl-glycyl-3'-O-acetyl-2'-deoxyuridine derivatives
respectively. Phosphorylation at the 5'-OH with POCl3, transformation into the 5'-
triphosphate by in-situ reaction with tetra(tri-n-butylammonium) pyrophosphate in DMF,
3'-de-O-acetylation, cleavage of the Fmoc group, and final purification by anion-exchange
chromatography on DEAE-Sephadex was performed as described in EXAMPLE 14.
Yields according to UV-absorbance of the uracil moiety were 0.41 mmole 5-(3-(N-
glycyl)-amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate (84%), 0.43 mmole 5-(3-(N-β-
alanyl)-amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate (75%) and 0.38 mmole 5-(3-
(N-glycyl-glycyl)-amidopropynyl-1)-2'-deoxyuridine-5'-triphosphate (78%).

These mass-modified nucleoside triphosphates were incorporated during the
Sanger DNA sequencing primer-extension reactions.

When using 5-(3-aminopropynyl-1)-2',3'-dideoxyuridine as starting material
and following an analogous reaction sequence the corresponding glycyl-, glycyl-glycyl-
and β-alanyl-2',3'-dideoxyuridine-5'-triphosphates were obtained in yields of 69, 63 and
71% respectively. These mass-modified nucleoside triphosphates serve as chain-
terminating nucleotides during the Sanger DNA sequencing reactions. The mass-
modified sequencing ladders are analyzed by either ES or MALDI mass spectrometry.

EXAMPLE 16 

Synthesis of 8-glycyl- and 8-glycyl-glycyl-2'-deoxyadenosine-5'-triphosphate

727 mg (1.0 mmole) of N6-(4-tert-butylphenoxyacetyl)-8-glycyl-5'-O-(4,4-
dimethoxytrityl)-2'-deoxyadenosine or 800 mg (1.0 mmole) N6-(4-tert-
butylphenoxyacetyl)-8-glycyl-glycyl-5'-O-(4,4-dimethoxytrityl)-2'-deoxyadenosine prepared
according to EXAMPLES 10 and 11 and literature (Koster et al., Tetrahedron 37, 362
(1981)) were acetylated with acetic anhydride in pyridine at the 3'-OH, detritylated at the
5'-position with 80% acetic acid in a one-pot reaction and transformed into the 5'-
triphosphates via phosphorylation with POCl3 and reaction in-situ with tetra(tri-n-
butylammonium) pyrophosphate as described in EXAMPLE 14. Deprotection of the N6-
tert-butylphenoxyacetyl, the 3'-O-acetyl and the O-methyl group at the glycine residues
was achieved with concentrated aqueous ammonia for ninety minutes at room
temperature. Ammonia was removed by lyophilization and the residue washed with
dichloromethane, solvent removed by evaporation in vacuo and the remaining solid
material purified by anion-exchange chromatography on DEAE-Sephadex using a linear
gradient of triethylammonium bicarbonate from 0.1 to 1.0 M. The nucleoside
triphosphate containing fractions (checked by TLC on polyethyleneimine cellulose plates)
were combined and lyophilized. Yield of the 8-glycyl-2'-deoxyadenosine-5'-triphosphate
(determined by UV-absorbance of the adenine moiety) was 57% (0.57 mmole). The yield
for the 8-glycyl-glycyl-2'-deoxyadenosine-5'-triphosphate was 51% (0.51 mmole).

These mass-modified nucleoside triphosphates were incorporated during 
primer-extension in the Sanger DNA sequencing reactions. 

When using the corresponding N6-(4-tert-butylphenoxyacetyl)-8-glycyl- or
-glycyl-glycyl-5'-O-(4,4-dimethoxytrityl)-2',3'-dideoxyadenosine derivatives as starting
materials prepared according to standard procedures (see, e.g., for the introduction of the
2',3'-function: Seela et al., Helvetica Chimica Acta 74, 1048-1058 (1991)) and using an
analogous reaction sequence as described above, the chain-terminating mass-modified
nucleoside triphosphates 8-glycyl- and 8-glycyl-glycyl-2',3'-dideoxyadenosine-5'-
triphosphates were obtained in 53 and 47% yields respectively. The mass-modified
sequencing fragment ladders are analyzed by either ES or MALDI mass spectrometry.

EXAMPLE 17 

Mass-modification of Sanger DNA sequencing fragment ladders by incorporation
of chain-elongating 2'-deoxy- and chain-terminating 2',3'-dideoxythymidine-5'-
(alpha-S-)-triphosphate and subsequent alkylation with 2-iodoethanol and 3-
iodopropanol

2',3'-Dideoxythymidine-5'-(alpha-S)-triphosphate was prepared according to
published procedures (e.g., for the alpha-S-triphosphate moiety: Eckstein et al.,
Biochemistry 15, 1685 (1976) and Accounts Chem. Res. 12, 204 (1978) and for the 2',3'-
dideoxy moiety: Seela et al., Helvetica Chimica Acta 74, 1048-1058 (1991)). Sanger
DNA sequencing reactions employing 2'-deoxythymidine-5'-(alpha-S)-triphosphate are
performed according to standard protocols (e.g. Eckstein, Ann. Rev. Biochem. 54, 367
(1985)). When using 2',3'-dideoxythymidine-5'-(alpha-S)-triphosphate, this is used
instead of the unmodified 2',3'-dideoxythymidine-5'-triphosphate in standard Sanger DNA
sequencing (see e.g. Swerdlow et al., Nucleic Acids Res. 18, 1415-1419 (1990)). The
template (2 pmole) and the nucleic acid M13 sequencing primer (4 pmole) modified
according to EXAMPLE 1 are annealed by heating to 65°C in 100 µl of 10 mM Tris-HCl
pH 7.5, 10 mM MgCl2, 50 mM NaCl, 7 mM dithiothreitol (DTT) for 5 min and slowly
brought to 37°C during a one hour period. The sequencing reaction mixtures contain, as
exemplified for the T-specific termination reaction, in a final volume of 150 µl, 200 µM
(final concentration) each of dATP, dCTP, dTTP, 300 µM c7-deaza-dGTP, 5 µM 2',3'-
dideoxythymidine-5'-(alpha-S)-triphosphate and 40 units Sequenase (United States
Biochemicals). Polymerization is performed for 10 min at 37°C, the reaction mixture
heated to 70°C to inactivate the Sequenase, ethanol precipitated and coupled to thiolated
Sequelon membrane disks (8 mm diameter) as described in EXAMPLE 1. Alkylation is
performed by treating the disks with 10 µl of 10 mM solution of either 2-iodoethanol or
3-iodopropanol in NMM (N-methylmorpholine/water/2-propanol, 2/49/49, v/v/v) (three
times), washing with 10 µl NMM (three times) and cleaving the alkylated T-terminated
primer-extension products off the support by treatment with DTT as described in
EXAMPLE 1. Analysis of the mass-modified fragment families is performed with either
ES or MALDI mass spectrometry.

EXAMPLE 18

Analysis of a Mixture of Oligothymidylic Acids 

Oligothymidylic acid, oligo p(dT)12-18, is commercially available (United
States Biochemical, Cleveland, OH). Generally, a matrix solution of 0.5 M in ethanol
was prepared. Various matrices were used for this Example and Examples 19-21 such
as 3,5-dihydroxybenzoic acid, sinapinic acid, 3-hydroxypicolinic acid, 2,4,6-
trihydroxyacetophenone. Oligonucleotides were lyophilized after purification by HPLC
and taken up in ultrapure water (MilliQ, Millipore) using amounts to obtain a
concentration of 10 pmoles/µl as stock solution. An aliquot (1 µl) of this concentration
or a dilution in ultrapure water was mixed with 1 µl of the matrix solution on a flat metal
surface serving as the probe tip and dried with a fan using cold air. In some experiments,
cation-ion exchange beads in the acid form were added to the mixture of matrix and
sample solution.

MALDI-TOF spectra were obtained for this Example and Examples 19-21
on different commercial instruments such as Vision 2000 (Finnigan-MAT), VG TofSpec
(Fisons Instruments), LaserTec Research (Vestec). The conditions for this Example were
linear negative ion mode with an acceleration voltage of 25 kV. The MALDI-TOF
spectrum generated is shown in FIGURE 14. Mass calibration was done externally and
generally achieved by using defined peptides of appropriate mass range such as insulin,
gramicidin S, trypsinogen, bovine serum albumin, and cytochrome C. All spectra were
generated by employing a nitrogen laser with 5 nsec pulses at a wavelength of 337 nm.
Laser energy varied between 10^6 and 10^7 W/cm^2. To improve signal-to-noise ratio
generally, the intensities of 10 to 30 laser shots were accumulated.
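
To indicate where the peaks of such a mixture would be expected, the following sketch estimates average masses for the p(dT)12-18 series. It is an illustration under stated assumptions, not data from this Example: the oligomers are treated as neutral 5'-phosphorylated species, rounded standard average atomic masses are used, and the helper names are hypothetical. In negative ion mode the singly deprotonated ions would appear about 1 dalton lower.

```python
# Average atomic masses in daltons (assumed, rounded standard values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974}

# One thymidine 5'-monophosphate residue within a chain (dTMP minus water).
DTMP_RESIDUE = {"C": 10, "H": 13, "N": 2, "O": 7, "P": 1}
WATER = {"H": 2, "O": 1}


def formula_mass(formula):
    """Sum the average atomic masses for a composition {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def pdT_mass(n):
    """Average mass of neutral p(dT)n: n chain residues plus one water."""
    return n * formula_mass(DTMP_RESIDUE) + formula_mass(WATER)


ladder = {n: pdT_mass(n) for n in range(12, 19)}
for n, mass in ladder.items():
    print(f"p(dT){n}: {mass:.1f} Da")
```

Neighboring oligomers in this series differ by one nucleotide residue, roughly 304 daltons, which sets the peak spacing to look for in the spectrum.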

EXAMPLE 19

Mass Spectrometry Analysis of a 50-mer and a 99-mer 

Two large oligonucleotides were analyzed by mass spectrometry. The 50-
mer

d(TAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT)
(SEQ ID NO:3) and dT(pdT)99 were used. The oligodeoxynucleotides were synthesized
using β-cyanoethylphosphoamidites and purified using published procedures (e.g. N.D.
Sinha, J. Biernat, J. McManus and H. Koster, Nucleic Acids Res. 12, 4539 (1984))
employing commercially available DNA synthesizers from either Millipore (Bedford, MA)
or Applied Biosystems (Foster City, CA) and HPLC equipment and RP18 reverse phase
columns from Waters (Milford, MA). The samples for mass spectrometric analysis were
prepared as described in Example 18. The conditions used for MALDI-MS analysis of
each oligonucleotide were 500 fmol of each oligonucleotide, reflectron positive ion mode
with an acceleration of 5 kV and postacceleration of 20 kV. The MALDI-TOF spectra
generated were superimposed and are shown in FIGURE 15.

EXAMPLE 20 

Simulation of the DNA Sequencing Results of FIGURE 2 

The 13 DNA sequences representing the nested dT-terminated fragments
of the Sanger DNA sequencing for the 50-mer described in Example 19 (SEQ ID NO:3)
were synthesized as described in Example 19. The samples were treated and 500 fmol of
each fragment was analyzed by MALDI-MS as described in Example 18. The resulting
MALDI-TOF spectra are shown in FIGURE 16. The conditions were reflectron positive
ion mode with an acceleration of 5 kV and postacceleration of 20 kV. Calculated
molecular masses and experimental molecular masses are shown in Table 1.

The MALDI-TOF spectra were superimposed (FIGURE 17) to
demonstrate that the individual peaks are resolvable even between the 10-mer and 11-mer
(upper panel) and the 37-mer and 38-mer (lower panel). The two panels show two
different scales and the spectra analyzed at that scale.
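
The lengths of the nested dT-terminated fragments can be read off the 50-mer sequence directly. The sketch below assumes that each fragment is a 5' prefix of SEQ ID NO:3 ending in dT and that the trivial 1-mer (the 5'-terminal T itself) is not counted; the function name is a hypothetical helper.

```python
# 50-mer of Example 19 (SEQ ID NO:3), written 5' to 3'.
SEQ = "TAACGGTCATTACGGCCATTGACTGTAGGACCTGCATTACATGACTAGCT"


def terminated_lengths(seq, base="T", min_len=2):
    """Lengths of all 5' prefixes of seq that end in the given base."""
    return [
        position
        for position, nucleotide in enumerate(seq, start=1)
        if nucleotide == base and position >= min_len
    ]


lengths = terminated_lengths(SEQ)
print(len(lengths), "fragments:", lengths)
```

Under these assumptions the list contains 13 lengths and includes the adjacent pairs 10/11 and 37/38, the pairs singled out in the superimposed spectra.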
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EXAMPLE 21 

MALDI-MS Analysis of a Mass-Modified Oligonucleotide 

A 17-mer was mass-modified at C-5 of one or two deoxyuridine moieties.
5-[13-(2-Methoxyethoxyl)-tridecyne-1-yl]-5'-O-(4,4'-dimethoxytrityl)-2'-deoxyuridine-3'-
β-cyanoethyl-N,N-diisopropylphosphoamidite was used to synthesize the modified
17-mers using the methods described in Example 19.

The modified 17-mers were

                    X
                    |
a: d(TAAAACGACGGCCAGUG) (molecular mass: 5454)
   (SEQ ID NO:4)

     X              X
     |              |
b: d(UAAAACGACGGCCAGUG) (molecular mass: 5634)
   (SEQ ID NO:5)

where X = -C≡C-(CH2)11-OH
(unmodified 17-mer: molecular mass: 5273)

The samples were prepared and 500 fmol of each modified 17-mer was
analyzed using MALDI-MS as described in Example 18. The conditions used were
reflectron positive ion mode with an acceleration of 5 kV and postacceleration of 20 kV.
The MALDI-TOF spectra which were generated were superimposed and are shown in
FIGURE 18.
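The mass increment introduced by each modification follows from the three molecular masses quoted above. This is a minimal sketch of that arithmetic; the variable and function names are hypothetical and only the stated masses are used.

```python
UNMODIFIED_17MER_MASS = 5273  # stated for the unmodified 17-mer

# label -> (number of modified deoxyuridines, stated molecular mass)
MODIFIED_17MERS = {
    "a": (1, 5454),
    "b": (2, 5634),
}


def mass_shift(n_modifications, modified_mass,
               unmodified_mass=UNMODIFIED_17MER_MASS):
    """Return (total shift, shift per modified site) in mass units."""
    total = modified_mass - unmodified_mass
    return total, total / n_modifications


shifts = {
    label: mass_shift(n_sites, mass)
    for label, (n_sites, mass) in MODIFIED_17MERS.items()
}
for label, (total, per_site) in shifts.items():
    print(f"17-mer {label}: total shift {total} u, {per_site:.1f} u per site")
```

Both oligonucleotides give roughly the same increment per site, so each additional modified deoxyuridine moves the signal by a fixed, predictable amount.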

EXAMPLE 22

Detection of Polymerase Chain Reaction Products Containing 7-Deazapurine

MATERIALS AND METHODS 

PCR amplifications 

The following oligodeoxynucleotide primers were either synthesized
according to standard phosphoamidite chemistry (Sinha, N.D., et al., (1983) Tetrahedron
Lett. Vol. 24, Pp. 5843-5846; Sinha, N.D., et al., (1984) Nucleic Acids Res., Vol. 12, Pp.
4539-4557) on a MilliGen 7500 DNA synthesizer (Millipore, Bedford, MA, USA) in 200
nmol scales or purchased from MWG-Biotech (Ebersberg, Germany, primer 3) and
Biometra (Goettingen, Germany, primers 6-7).





primer 1:  5'-GTCACCCTCGACCTGCAG     (SEQ. ID. NO. 6);
primer 2:  5'-TTGTAAAACGACGGCCAGT    (SEQ. ID. NO. 7);
primer 3:  5'-CTTCCACCGCGATGTTGA     (SEQ. ID. NO. 8);
primer 4:  5'-CAGGAAACAGCTATGAC      (SEQ. ID. NO. 9);
primer 5:  5'-GTAAAACGACGGCCAGT      (SEQ. ID. NO. 10);
primer 6:  5'-GTCACCCTCGACCTGCAgC    (g: RiboG) (SEQ. ID. NO. 11);
primer 7:  5'-GTTGTAAAACGAGGGCCAgT   (g: RiboG) (SEQ. ID. NO. 12);



The 99-mer and 200-mer DNA strands (modified and unmodified) as well
as the ribo- and 7-deaza-modified 100-mer were amplified from pRFcl DNA (10 ng,
generously supplied by S. Feyerabend, University of Hamburg) in 100 µL reaction volume
containing 10 mmol/L KCl, 10 mmol/L (NH4)2SO4, 20 mmol/L Tris HCl (pH = 8.8), 2
mmol/L MgSO4 (exo(-) Pyrococcus furiosus (Pfu) buffer, Pharmacia, Freiburg,
Germany), 0.2 mmol/L each dNTP (Pharmacia, Freiburg, Germany), 1 µmol/L of each
primer and 1 unit of exo(-)Pfu DNA polymerase (Stratagene, Heidelberg, Germany).

For the 99-mer primers 1 and 2, for the 200-mer primers 1 and 3 and for
the 100-mer primers 6 and 7 were used. To obtain 7-deazapurine modified nucleic acids,
dATP and dGTP were replaced with 7-deaza-dATP and 7-deaza-dGTP during PCR
amplification. The reaction was performed in a thermal cycler (OmniGene, MWG-Biotech,
Ebersberg, Germany) using the cycle: denaturation at 95 °C for 1 min, annealing
at 51 °C for 1 min and extension at 72 °C for 1 min. For all PCRs the number of reaction
cycles was 30. The reaction was allowed to extend for an additional 10 min at 72 °C after
the last cycle.

The 103-mer DNA strands (modified and unmodified) were amplified
from M13mp18 RFI DNA (100 ng, Pharmacia, Freiburg, Germany) in 100 µL reaction
volume using primers 4 and 5; all other concentrations were unchanged. The reaction was
performed using the cycle: denaturation at 95 °C for 1 min, annealing at 40 °C for 1 min
and extension at 72 °C for 1 min. After 30 cycles for the unmodified and 40 cycles for
the modified 103-mer, respectively, the samples were incubated for an additional 10 min at
72 °C.

Synthesis of 5'-[32P]-labeled PCR-primers

Primers 1 and 4 were 5'-[32P]-labeled employing T4 polynucleotide kinase
(Epicentre Technologies) and (γ-32P)-ATP (BLU/NGG/502A, Dupont, Germany)
according to the protocols of the manufacturer. The reactions were performed
substituting 10% of primer 1 and 4 in PCR with the labeled primers under otherwise
unchanged reaction conditions. The amplified DNAs were separated by gel
electrophoresis on a 10% polyacrylamide gel. The appropriate bands were excised and
counted on a Packard TRI-CARB 460C liquid scintillation system (Packard, CT, USA).

Primer-cleavage from ribo-modified PCR-product

The amplified DNA was purified using Ultrafree-MC filter units (30,000
NMWL). It was then redissolved in 100 µl of 0.2 mol/L NaOH and heated at 95 °C for 25
minutes. The solution was then acidified with HCl (1 mol/L) and further purified for
MALDI-TOF analysis employing Ultrafree-MC filter units (10,000 NMWL) as described
below.

Purification of PCR products 

All samples were purified and concentrated using Ultrafree-MC units
30,000 NMWL (Millipore, Eschborn, Germany) according to the manufacturer's
description. After lyophilisation, PCR products were redissolved in 5 µL (3 µL for the
200-mer) of ultrapure water. This analyte solution was directly used for MALDI-TOF
measurements.

MALDI-TOF MS

Aliquots of 0.5 µL of analyte solution and 0.5 µL of matrix solution (0.7
mol/L 3-HPA and 0.07 mol/L ammonium citrate in acetonitrile/water (1:1, v/v)) were
mixed on a flat metallic sample support. After drying at ambient temperature the sample
was introduced into the mass spectrometer for analysis. The MALDI-TOF mass
spectrometer used was a Finnigan MAT Vision 2000 (Finnigan MAT, Bremen,
Germany). Spectra were recorded in the positive ion reflector mode with a 5 keV ion
source and 20 keV postacceleration. The instrument was equipped with a nitrogen laser
(337 nm wavelength). The vacuum of the system was 3-4 × 10⁻⁸ hPa in the analyzer
region and 1-4 × 10⁻⁷ hPa in the source region. Spectra of modified and unmodified DNA
samples were obtained with the same relative laser power; external calibration was
performed with a mixture of synthetic oligodeoxynucleotides (7- to 50-mer).



RESULTS AND DISCUSSION

Enzymatic synthesis of 7-deazapurine nucleotide containing nucleic
acids by PCR

In order to demonstrate the feasibility of MALDI-TOF MS for the rapid,
gel-free analysis of short PCR products and to investigate the effect of 7-deazapurine
modification of nucleic acids under MALDI-TOF conditions, two different
primer-template systems were used to synthesize DNA fragments. Sequences are displayed in
Figures 24 and 25. While the two single strands of the 103-mer PCR product had nearly
equal masses (Δm = 8 u), the two single strands of the 99-mer differed by 526 u.

Considering the facts that 7-deaza purine nucleotide building blocks for
chemical DNA synthesis are approximately 160 times more expensive than regular ones
(Product Information, Glen Research Corporation, Sterling, VA) and their application in
standard β-cyano-phosphoamidite chemistry is not trivial (Product Information, Glen
Research Corporation, Sterling, VA; Schneider, K. and B.T. Chait (1995) Nucleic Acids
Res. 23, 1570), the cost of 7-deaza purine modified primers would be very high.
Therefore, to increase the applicability and scope of the method, all PCRs were
performed using unmodified oligonucleotide primers which are routinely available.

Substituting dATP and dGTP by c7-dATP and c7-dGTP in polymerase chain reaction led
to products containing approximately 80% 7-deaza-purine modified nucleosides for the
99-mer and 103-mer, and about 90% for the 200-mer, respectively. Table II shows the
base composition of all PCR products.

TABLE II

Base composition of the 99-mer, 103-mer and 200-mer PCR amplification products
(unmodified and 7-deaza purine modified)

DNA fragments¹         C    T    A    G   c7-deaza-A  c7-deaza-G  rel. modification²
200-mer s             54   34   56   56
modified 200-mer s    54   34    6    5       50          51            90%
200-mer a             56   56   34   54
modified 200-mer a    56   56    3    4       31          50            92%
103-mer s             28   23   24   28
modified 103-mer s    28   23    6    5       18          23            79%
103-mer a             28   24   23   28
modified 103-mer a    28   24    7    4       16          24            78%
99-mer s              34   21   24   20
modified 99-mer s     34   21    6    5       18          15            75%
99-mer a              20   24   21   34
modified 99-mer a     20   24    3    4       18          30            87%

¹ "s" and "a" describe "sense" and "antisense" strands of the double-stranded PCR
product.
² indicates relative modification as percentage of 7-deaza purine modified nucleotides of
total amount of purine nucleotides.
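The relative modification defined in the second footnote can be recomputed from the purine counts of Table II. The sketch below is illustrative only: the data layout and function name are hypothetical, and the counts are those listed for the modified products (remaining A and G, plus c7-deaza-A and c7-deaza-G).

```python
# (strand, remaining A, remaining G, c7-deaza-A, c7-deaza-G), from Table II
MODIFIED_PRODUCTS = [
    ("200-mer s", 6, 5, 50, 51),
    ("200-mer a", 3, 4, 31, 50),
    ("103-mer s", 6, 5, 18, 23),
    ("103-mer a", 7, 4, 16, 24),
    ("99-mer s", 6, 5, 18, 15),
    ("99-mer a", 3, 4, 18, 30),
]


def relative_modification(a, g, c7a, c7g):
    """Percentage of 7-deaza purines among all purine nucleotides."""
    modified = c7a + c7g
    return 100.0 * modified / (a + g + modified)


percentages = {
    name: relative_modification(a, g, c7a, c7g)
    for name, a, g, c7a, c7g in MODIFIED_PRODUCTS
}
for name, value in percentages.items():
    print(f"{name}: {value:.0f}%")
```

The unmodified purines that remain in each strand are those contributed by the unmodified primers, which is why the shorter products show a lower relative modification than the 200-mer.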






However, it remained to be determined whether 80-90% 7-deaza-purine
modification would be sufficient for accurate mass spectrometer detection. It was
therefore important to determine whether all purine nucleotides could be substituted
during the enzymatic amplification step. It was found that exo(-) Pyrococcus furiosus
(Pfu) DNA polymerase indeed could accept c7-dATP and c7-dGTP in the absence of
unmodified purine triphosphates. However, the incorporation was less efficient, leading
to a lower yield of PCR product (Figure 26). Ethidium bromide stains by intercalation
with the stacked bases of the DNA double strand. Therefore lower band intensities in the
ethidium bromide stained gel might be artifacts, since the modified DNA strands do not
necessarily need to give the same band intensities as the unmodified ones.

To verify these results, the PCRs with [32P]-labeled primers were
repeated. The autoradiogram (Figure 27) clearly shows lower yields for the modified
PCR products. The bands were excised from the gel and counted. For all PCR products
the yield of the modified nucleic acids was about 50%, referring to the corresponding
unmodified amplification product. Further experiments showed that exo(-)DeepVent and
Vent DNA polymerase were able to incorporate c7-dATP and c7-dGTP during PCR as
well. The overall performance, however, turned out to be best for the exo(-)Pfu DNA
polymerase, giving the fewest side products during amplification. Using all three polymerases,
it was found that such PCRs employing c7-dATP and c7-dGTP instead of their isosteres
showed fewer side reactions, giving a cleaner PCR product. Decreased occurrence of
amplification side products may be explained by a reduction of primer mismatches due to
a lower stability of the complex formed from the primer and the 7-deaza-purine
containing template which is synthesized during PCR. A decreased melting point for DNA
duplexes containing 7-deaza-purine has been described (Mizusawa, S. et al., (1986)
Nucleic Acids Res. 14, 1319-1324). In addition to the three polymerases specified above
(exo(-)DeepVent DNA polymerase, Vent DNA polymerase and exo(-)Pfu DNA
polymerase), it is anticipated that other polymerases, such as the large Klenow fragment
of E. coli DNA polymerase, Sequenase, Taq DNA polymerase, and U AmpliTaq,
AmpliTaq or AmpliTaq TS DNA polymerase can be used. In addition, where RNA is the
template, RNA polymerases, such as the SP6 or the T7 RNA polymerase, must be used.

MALDI-TOF mass spectrometry of modified and unmodified PCR
products

The 99-mer, 103-mer and 200-mer PCR products were analyzed by
MALDI-TOF MS. Based on past experience, it was known that the degree of
depurination depends on the laser energy used for desorption and ionization of the
analyte. Since the influence of 7-deazapurine modification on fragmentation due to
depurination was to be investigated, all spectra were measured at the same relative laser
energy.

Figures 28a and 28b show the mass spectra of the modified and
unmodified 103-mer nucleic acids. In case of the unmodified 103-mer, fragmentation
causes a broad (M+H)+ signal. The maximum of the peak is shifted to lower masses so
that the assigned mass represents a mean value of the (M+H)+ signal and signals of
fragmented ions, rather than the (M+H)+ signal itself. Although the modified 103-mer
still contains about 20% A and G from the oligonucleotide primers, it shows less
fragmentation, which is reflected in much narrower and more symmetric signals. In
particular, peak tailing on the lower mass side due to depurination is substantially reduced.
Hence, the difference between measured and calculated mass is strongly reduced, although
the measured mass is still below the expected mass. For the unmodified sample a (M+H)+
signal of 31670 was observed, which is a 97 u or 0.3% difference to the calculated mass,
while in case of the modified sample this mass difference diminished to 10 u or 0.03%
(31713 u found, 31723 u calculated). These observations are verified by a significant
increase in mass resolution of the (M+H)+ signal of the two single strands (m/Δm = 67 as
opposed to 18 for the unmodified sample, with Δm = full width at half maximum, fwhm).
Because of the low mass difference between the two single strands (8 u) their individual
signals were not resolved.
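The accuracy and resolution figures quoted for the 103-mer can be checked with a few lines of arithmetic. This is a minimal sketch with hypothetical function names. The calculated mass of the unmodified strand is not stated explicitly and is derived here as 31670 + 97 u, which is an assumption of the example.

```python
def mass_error(found, calculated):
    """Return (absolute error in u, relative error in percent)."""
    absolute = calculated - found
    return absolute, 100.0 * absolute / calculated


def peak_width(mass, resolution):
    """Full width at half maximum implied by a resolution m/Δm."""
    return mass / resolution


# Unmodified 103-mer: calculated mass derived from the stated 97 u deficit.
unmodified_error = mass_error(found=31670, calculated=31670 + 97)
# Modified 103-mer: both values are stated in the text.
modified_error = mass_error(found=31713, calculated=31723)

unmodified_width = peak_width(31670, 18)
modified_width = peak_width(31713, 67)

print(f"unmodified: {unmodified_error[0]} u, {unmodified_error[1]:.2f}%")
print(f"modified:   {modified_error[0]} u, {modified_error[1]:.2f}%")
print(f"peak widths: {unmodified_width:.0f} u vs {modified_width:.0f} u")
```

Even for the modified sample the implied peak width is several hundred mass units, far larger than the 8 u separating the two 103-mer strands, which is consistent with the two strands appearing as a single signal.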

With the results for the 99 base pair DNA fragments, the effect of
increased mass resolution for 7-deazapurine containing DNA becomes even more
evident. The two single strands in the unmodified sample were not resolved even though
the mass difference between the two strands of the PCR product was very high at 526
u due to unequal distribution of purines and pyrimidines (figure 29a). In contrast to this,
the modified DNA showed distinct peaks for the two single strands (figure 29b), which
makes the superiority of this approach over gel electrophoretic methods for the
determination of molecular weights even more evident. Although base line resolution was not
obtained, the individual masses could be assigned with an accuracy of 0.1%: Δm
= 27 u for the lighter (calc. mass = 30224 u) and Δm = 14 u for the heavier strand (calc.
mass = 30750 u). Again, it was found that the full width at half maximum was
substantially decreased for the 7-deazapurine containing sample.

In case of both the 99-mer and 103-mer, the 7-deazapurine containing
nucleic acids seem to give higher sensitivity despite the fact that they still contain about
20% unmodified purine nucleotides. To get a comparable signal-to-noise ratio at similar
intensities for the (M+H)+ signals, the unmodified 99-mer required 20 laser shots in
contrast to 12 for the modified one, and the 103-mer required 12 shots for the unmodified
sample as opposed to three for the 7-deazapurine nucleoside-containing PCR product.
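The separation of the two 99-mer strands can be related to mass resolution with a rough estimate. The sketch below assumes a simple criterion that is not taken from the text: two peaks are treated as distinguishable once their separation is at least one full width at half maximum. It also borrows the resolutions reported for the 103-mer (18 and 67) as stand-ins, since no resolution value is given for the 99-mer. All names are hypothetical.

```python
LIGHT_STRAND_MASS = 30224  # calculated mass of the lighter 99-mer strand, u
HEAVY_STRAND_MASS = 30750  # calculated mass of the heavier 99-mer strand, u


def required_resolution(mass_1, mass_2):
    """Resolution m/Δm at which the peak width equals the peak separation.

    Assumed criterion for this sketch: separation >= one FWHM.
    """
    mean_mass = (mass_1 + mass_2) / 2
    return mean_mass / abs(mass_2 - mass_1)


needed = required_resolution(LIGHT_STRAND_MASS, HEAVY_STRAND_MASS)
print(f"resolution needed for a 526 u separation: about {needed:.0f}")

# Resolutions reported for the 103-mer, used here only as stand-ins.
for label, resolution in (("unmodified", 18), ("7-deaza modified", 67)):
    verdict = "separated" if resolution >= needed else "not separated"
    print(f"{label}: m/Δm = {resolution} -> strands {verdict}")
```

Under these assumptions the unmodified value falls well short of the threshold while the modified value just exceeds it, which matches the observation of distinct peaks without base line resolution.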

Comparing the spectra of the modified and unmodified 200-mer
amplicons, improved mass resolution was again found for the 7-deazapurine containing
sample as well as increased signal intensities (figures 30a and 30b). While the signal of
the single strands predominates in the spectrum of the modified sample, the DNA duplex
and dimers of the single strands gave the strongest signal for the unmodified sample.

A complete 7-deaza purine modification of nucleic acids may be achieved
either using modified primers in PCR or cleaving the unmodified primers from the
partially modified PCR product. Since disadvantages are associated with modified
primers, as described above, a 100-mer was synthesized using primers with a
ribo-modification. The primers were cleaved hydrolytically with NaOH according to a method
developed earlier in our laboratory (Koester, H. et al., Z. Physiol. Chem. 359: 1570-
1589). Figures 31a and 31b display the spectra of the PCR product before and after
primer cleavage. Figure 31b shows that the hydrolysis was successful: both the hydrolyzed
PCR product and the two released primers could be detected, together with a small
signal from residual uncleaved 100-mer. This procedure is especially useful for the
MALDI-TOF analysis of very short PCR products since the share of unmodified purines
originating from the primer increases with decreasing length of the amplified sequence.

The remarkable properties of 7-deazapurine modified nucleic acids can be
explained by either more effective desorption and/or ionization, increased ion stability
and/or a lower denaturation energy of the double stranded purine modified nucleic acid.
The exchange of the N-7 for a methine group results in the loss of one acceptor for a
hydrogen bond, which influences the ability of the nucleic acid to form secondary
structures due to non-Watson-Crick base pairing (Seela, F. and A. Kehne (1987)
Biochemistry, 26, 2232-2238), which should be a reason for better desorption during the
MALDI process. In addition to this, the aromatic system of 7-deazapurine has a lower
electron density that weakens Watson-Crick base pairing, resulting in a decreased melting
point (Mizusawa, S. et al., (1986) Nucleic Acids Res. 14, 1319-1324) of the double
strand. This effect may decrease the energy needed for denaturation of the duplex in the
MALDI process. These aspects, as well as the loss of a site which probably will carry a
positive charge on the N-7 nitrogen, render the 7-deazapurine modified nucleic acid less
polar and may promote the effectiveness of desorption.

Because of the absence of N-7 as proton acceptor and the decreased
polarization of the C-N bond in 7-deazapurine nucleosides, depurination following the
mechanisms established for hydrolysis in solution is prevented. Although a direct
correlation of reactions in solution and in the gas phase is problematic, less fragmentation
due to depurination of the modified nucleic acids can be expected in the MALDI process.
Depurination may either be accompanied by loss of charge, which decreases the total yield
of charged species, or it may produce charged fragmentation products, which decreases
the intensity of the non-fragmented molecular ion signal.

The observation of both increased sensitivity and decreased peak tailing of
the (M+H)+ signals on the lower mass side due to decreased fragmentation of the
7-deazapurine containing samples indicates that the N-7 atom indeed is essential for the
mechanism of depurination in the MALDI-TOF process. In conclusion, 7-deazapurine
containing nucleic acids show distinctly increased ion stability and sensitivity under
MALDI-TOF conditions and therefore provide for higher mass accuracy and mass
resolution.

EXAMPLE 23

Solid State Sequencing and Mass Spectrometer Detection

MATERIALS AND METHODS

Oligonucleotides were purchased from Operon Technologies (Alameda,
CA) in an unpurified form. Their sequences are listed in Table III. Sequencing reactions
were performed on a solid surface using reagents from the sequencing kit for Sequenase
Version 2.0 (Amersham, Arlington Heights, Illinois).

Sequencing a 39-mer target

Sequencing complex:
5'-TCTGGCCTGGTGCAGGGCCTATTGTAGTTGTGACGTACA-(Ab)n-3'
(DNA11683) (SEQ. ID. No. 13)
3'-TCAACACTGCATGT-5' (PNA16/DNA) (SEQ. ID. No. 14)

In order to perform solid-state DNA sequencing, template strand
DNA11683 was 3'-biotinylated by terminal deoxynucleotidyl transferase. A 30 µl
reaction, containing 60 pmol of DNA11683, 1.3 nmol of biotin-14-dATP (GIBCO BRL,
Grand Island, NY), 30 units of terminal transferase (Amersham, Arlington Heights,
Illinois), and 1x reaction buffer (supplied with enzyme), was incubated at 37°C for 1
hour. The reaction was stopped by heat inactivation of the terminal transferase at 70°C
for 10 min. The resulting product was desalted by passing through a TE-10 spin column
(Clontech). More than one molecule of biotin-14-dATP could be added to the 3'-end
of DNA11683. The biotinylated DNA11683 was incubated with 0.3 mg of Dynal
streptavidin beads in 30 µl 1x binding and washing buffer at ambient temperature for 30
min. The beads were washed twice with TE and redissolved in 30 µl TE; a 10 µl aliquot
(containing 0.1 mg of beads) was used for sequencing reactions.

The 0.1 mg of beads from the previous step were resuspended in a 10 µl volume
containing 2 µl of 5x Sequenase buffer (200 mM Tris-HCl, pH 7.5, 100 mM MgCl2, and
250 mM NaCl) from the Sequenase kit and 5 pmol of the corresponding primer
PNA16/DNA. The annealing mixture was heated to 70°C and allowed to cool slowly to
room temperature over a 20-30 min time period. Then 1 µl 0.1 M dithiothreitol solution,
1 µl Mn buffer (0.15 M sodium isocitrate and 0.1 M MnCl2), and 2 µl of diluted
Sequenase (3.25 units) were added. The reaction mixture was divided into four aliquots
of 3 µl each and mixed with termination mixes (each consists of 3 µl of the appropriate
termination mix: 32 µM c7dATP, 32 µM dCTP, 32 µM c7dGTP, 32 µM dTTP and 3.2
µM of one of the four ddNTPs, in 50 mM NaCl). The reaction mixtures were incubated
at 37°C for 2 min. After the completion of extension, the beads were precipitated and
the supernatant was removed. The beads were washed twice and resuspended in TE and
kept at 4°C.

Sequencing a 78-mer target

Sequencing complex:
5'-AAGATCTGACCAGGGATTCGGTTAGCGTGACTGCTGCTGCTGCTGCTGCTGC
TGGATGATCCGACGCATCAGATCTGG-(Ab)n-3' (SEQ. ID. NO. 15) (TNR.PLASM2)
3'-CTACTAGGCTGCGTAGTC-5' (CM1) (SEQ. ID. NO. 16)

The target TNR.PLASM2 was biotinylated and sequenced using
procedures similar to those described in the previous section (sequencing a 39-mer target).

Sequencing a 15-mer target with partially duplex probe

Sequencing complex:
5'-F-GATGATCCGACGCATCACAGCTC 3' (SEQ ID NO: 17)
3'-b-CTACTAGGCTGCGTAGTGTCGAGAACCTTGGCT 3' (SEQ ID NO: 18)

CM1B3B was immobilized on Dynabeads M280 with streptavidin (Dynal,
Norway) by incubating 60 pmol of CM1B3B with 0.3 mg magnetic beads in 30 µl 1 M NaCl
and TE (1x binding and washing buffer) at room temperature for 30 min. The beads
were washed twice with TE and redissolved in 30 µl TE; a 10 or 20 µl aliquot (containing
0.1 or 0.2 mg of beads, respectively) was used for sequencing reactions.

The duplex was formed by annealing the corresponding aliquot of beads from the
previous step with 10 pmol of DF11a5F (or 20 pmol of DF11a5F for 0.2 mg of beads) in
a 9 µl volume containing 2 µl of 5x Sequenase buffer (200 mM Tris-HCl, pH 7.5, 100
mM MgCl2, and 250 mM NaCl) from the Sequenase kit. The annealing mixture was
heated to 65°C and allowed to cool slowly to 37°C over a 20-30 min time period. The
duplex primer was then mixed with 10 pmol of TS10 (20 pmol of TS10 for 0.2 mg of
beads) in 1 µl volume, and the resulting mixture was further incubated at 37°C for 5 min
and at room temperature for 5-10 min. Then 1 µl 0.1 M dithiothreitol solution, 1 µl Mn buffer
(0.15 M sodium isocitrate and 0.1 M MnCl2), and 2 µl of diluted Sequenase (3.25 units)
were added. The reaction mixture was divided into four aliquots of 3 µl each and mixed
with termination mixes (each consists of 4 µl of the appropriate termination mix: 16 µM
dATP, 16 µM dCTP, 16 µM dGTP, 16 µM dTTP and 1.6 µM of one of the four
ddNTPs, in 50 mM NaCl). The reaction mixtures were incubated at room temperature
for 5 min, and at 37°C for 5 min. After the completion of extension, the beads were
precipitated and the supernatant was removed. The beads were resuspended in 20 µl TE
and kept at 4°C. An aliquot of 2 µl (out of 20 µl) from each tube was taken and mixed
with 8 µl of formamide; the resulting samples were denatured at 90-95°C for 5 min and 2
µl (out of 10 µl total) was applied to an ALF DNA sequencer (Pharmacia, Piscataway,
NJ) using a 10% polyacrylamide gel containing 7 M urea and 0.6x TBE. The remaining
aliquot was used for MALDI-TOF MS analysis.

MALDI sample preparation and instrumentation

Before MALDI analysis, the sequencing ladder loaded magnetic beads
were washed twice using 50 mM ammonium citrate and resuspended in 0.5 µl pure
water. The suspension was then loaded onto the sample target of the mass spectrometer
and 0.5 µl of saturated matrix solution (3-hydroxypicolinic acid (HPA): ammonium citrate
= 10:1 mole ratio in 50% acetonitrile) was added. The mixture was allowed to dry prior
to mass spectrometer analysis.

The reflectron TOF MS mass spectrometer (Vision 2000, Finnigan MAT,
Bremen, Germany) was used for analysis. 5 kV was applied in the ion source and 20 kV
was applied for postacceleration. All spectra were taken in the positive ion mode and a
nitrogen laser was used. Normally, each spectrum was averaged for more than 100 shots
and a standard 25-point smoothing was applied.

RESULTS AND DISCUSSIONS

Conventional solid-state sequencing

In conventional sequencing methods, a primer is directly annealed to the
template and then extended and terminated in a Sanger dideoxy sequencing reaction.
Normally, a biotinylated primer is used and the sequencing ladders are captured by
streptavidin-coated magnetic beads. After washing, the products are eluted from the beads
using EDTA and formamide. However, our previous findings indicated that only the annealed
strand of a duplex is desorbed and the immobilized strand remains on the beads (Tang, K.
et al., (1995) Nucleic Acids Research 23, 3126-3131). Therefore, it is advantageous to
immobilize the template and anneal the primer. After the sequencing reaction and
washing, the beads with the immobilized template and annealed sequencing ladder can be
loaded directly onto the mass spectrometer target and mixed with matrix. In MALDI, only
the annealed sequencing ladder will be desorbed and ionized, and the immobilized
template will remain on the target.

A 39-mer template (SEQ. ID. No. 13) was first biotinylated at the 3' end
by adding biotin-14-dATP with terminal transferase. More than one biotin-14-dATP
molecule could be added by the enzyme. However, since the template was immobilized
and remained on the beads during MALDI, the number of biotin-14-dATP molecules would not
affect the mass spectra. A 14-mer primer (SEQ. ID. No. 14) was used for the solid-state
sequencing. MALDI-TOF mass spectra of the four sequencing ladders are shown in
Figure 32, and the expected theoretical values are shown in Table III. The sequencing
reaction produced a relatively homogeneous ladder, and the full-length sequence was
determined easily. One peak around 5150, which appeared in all reactions, was not identified. A
possible explanation is that a small portion of the template formed some kind of
secondary structure, such as a loop, which hindered Sequenase extension.
Mis-incorporation is of minor importance, since the intensity of these peaks was much lower
than that of the sequencing ladders. Although 7-deaza purines were used in the
sequencing reaction, which could stabilize the N-glycosidic bond and prevent
depurination, minor base losses were still observed since the primer was not substituted
by 7-deazapurines. The full length ladder, with a ddA at the 3' end, appeared in the A
reaction with an apparent mass of 11899.8. However, a more intense peak of 122
appeared in all four reactions and is likely due to an addition of an extra nucleotide by the
Sequenase enzyme.

The same technique could be used to sequence longer DNA fragments. A
78-mer template containing a CTG repeat (SEQ. ID. No. 15) was 3'-biotinylated by
adding biotin-14-dATP with terminal transferase. An 18-mer primer (SEQ. ID. No. 16)
was annealed right outside the CTG repeat so that the repeat could be sequenced
immediately after primer extension. The four reactions were washed and analyzed by
MALDI-TOF MS as usual. An example of the G-reaction is shown in Figure 33 and the
expected sequencing ladder is shown in Table IV with theoretical mass values for each
ladder component. All sequencing peaks were well resolved except the last component
(theoretical value 20577.4), which was indistinguishable from the background. Two neighboring
sequencing peaks (a 62-mer and a 63-mer) were also separated, indicating that such
sequencing analysis could be applicable to longer templates. Again, an addition of an
extra nucleotide by the Sequenase enzyme was observed in this spectrum. This addition
is not template specific and appeared in all four reactions, which makes it easy to
identify. Compared to the primer peak, the sequencing peaks were at much lower
intensity in the long template case. Further optimization of the sequencing reaction may
be required.

Sequencing using duplex DNA probes for capturing and priming

Duplex DNA probes with single-stranded overhang have been demonstrated to
be able to capture specific DNA templates and also serve as primers for solid-state sequencing.
The scheme is shown in Figure 34. Stacking interactions between a duplex probe and a
single-stranded template allow a 5-base overhang to be sufficient for capturing. Based on this
format, a 5' fluorescent-labeled 23-mer (5'-GAT GAT CCG ACG CAT CAC AGC TC) (SEQ.
ID. No. 19) was annealed to a 3'-biotinylated 18-mer (5'-GTG ATG CGT CGG ATC ATC)
(SEQ. ID. No. 20), leaving a 5-base overhang. A 15-mer template (5'-TCG GTT CCA AGA
GCT) (SEQ. ID. No. 21) was captured by the duplex and sequencing reactions were performed
by extension of the 5-base overhang. MALDI-TOF mass spectra of the reactions are shown in
Figure 35. All sequencing peaks were resolved although at relatively low intensities. The last
peak in each reaction is due to unspecific addition of one nucleotide to the full length extension
product by the Sequenase enzyme. For comparison, the same products were run on a
conventional DNA sequencer and a stacking fluorogram of the results is shown in Figure 36.
As can be seen from the Figure, the mass spectra had the same pattern as the fluorogram with
sequencing peaks at much lower intensity compared to the 23-mer primer.

Improvements of MALDI-TOF mass spectrometry as a detection technique

Sample distribution can be made more homogeneous and signal intensity could
potentially be increased by implementing the picoliter vial technique. In practice, the samples
can be loaded on small pits with square openings of 100 µm size. The beads used in the
solid-state sequencing are less than 10 µm in diameter, so they should fit well in the picoliter vials.
Microcrystals of matrix and DNA containing "sweet spots" will be confined in the vial. Since
the laser spot size is about 100 µm in diameter, it will cover the entire opening of the vial.
Therefore, searching for sweet spots will be unnecessary and a high repetition-rate laser (e.g.
>10 Hz) can be used for acquiring spectra. An earlier report has shown that this device is
capable of increasing the detection sensitivity of peptides and proteins by several orders of
magnitude compared to the conventional MALDI sample preparation technique.

Resolution of MALDI on DNA needs to be further improved in order to extend the
sequencing range beyond 100 bases. Currently, using 3-HPA/ammonium citrate as matrix and a
reflectron TOF mass spectrometer with a 5 kV ion source and 20 kV postacceleration, the
resolution of the run-through peak in Figure 33 (73-mer) is greater than 200 (FWHM), which is
enough for sequence determination in this case. This resolution is also the highest reported for
MALDI-desorbed DNA ions above the 70-mer range. Use of the delayed extraction technique
may further enhance resolution.
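
The relation between the quoted resolution and the spacing of a sequencing ladder can be illustrated with a rough calculation. The sketch below is an estimate under stated assumptions, not part of the reported experiments: it uses standard average masses of the four 2'-deoxynucleotide residues, approximates the mass of an n-mer as n times the mean residue mass (ignoring end groups, the biotin label and counter-ions), and takes the FWHM peak width as the mass divided by the resolution. It does not model adduct formation, fragmentation or the decline of resolution with increasing mass. All function names are hypothetical.

```python
# Rough estimate only; names and the mass approximation are assumptions.

# Average masses (Da) of nucleotide residues within a DNA chain.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
MEAN_RESIDUE_MASS = sum(RESIDUE_MASS.values()) / len(RESIDUE_MASS)


def approx_mass(n_bases):
    """Approximate mass of an n-mer as n times the mean residue mass."""
    return n_bases * MEAN_RESIDUE_MASS


def fwhm_width(mass, resolution):
    """Peak width (FWHM) implied by a resolution defined as m / delta-m."""
    return mass / resolution


def ladder_is_resolved(n_bases, resolution):
    """True if the FWHM width is below the smallest one-nucleotide step."""
    smallest_step = min(RESIDUE_MASS.values())
    return fwhm_width(approx_mass(n_bases), resolution) < smallest_step


if __name__ == "__main__":
    n = 73  # length of the run-through product discussed in the text
    mass = approx_mass(n)
    print(f"approximate mass of a {n}-mer: {mass:.0f} Da")
    print(f"FWHM width at resolution 200: {fwhm_width(mass, 200):.1f} Da")
    print("one-nucleotide steps resolved:", ladder_is_resolved(n, 200))
```

With these assumptions, a resolution of 200 at the 73-mer corresponds to a peak width of roughly 113 Da, which is below the smallest single-nucleotide step of about 289 Da within one base-specific ladder.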

All of the above-cited references and publications are hereby incorporated by
reference.

Equivalents

Those skilled in the art will recognize, or be able to ascertain using no more than
routine experimentation, numerous equivalents to the specific procedures described herein.
Such equivalents are considered to be within the scope of this invention and are covered by the
following claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Koster, Hubert

(ii) TITLE OF INVENTION: DNA SEQUENCING BY MASS SPECTROMETRY

(iii) NUMBER OF SEQUENCES: 21

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Patent Group
Foley, Hoag & Eliot LLP
(B) STREET: One Post Office Square
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: USA
(F) ZIP: 02109-2170

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: ASCII (text)

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 18-MAR-1997
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/617,010
(B) FILING DATE: 18-MAR-1996

(viii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/178,216
(B) FILING DATE: 06-JAN-1994
(C) CLASSIFICATION:

(ix) ATTORNEY/AGENT INFORMATION:
(A) NAME: Arnold, Beth E.
(B) REGISTRATION NUMBER: 35,430
(C) REFERENCE/DOCKET NUMBER: SQA-3.2 5.27

(x) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (617) 832-1294
(B) TELEFAX: (617) 832-7000



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CATGCCATGG CATG

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

AAATTGTGCA CATCCTGCAG C

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TAACGGTCAT TACGGCCATT GACTGTAGGA CCTGCATTAC ATGACTAGCT 50

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

TAAAACGACG GGCCAGXG 17

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(iii) HYPOTHETICAL: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

XAAAACGACG GGCCAGXG 17



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTCACCCTCG ACCTGCAG 18



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TTGTAAAACG ACGGCCAGT 19

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

CTTCCACCGC GATGTTGA 18

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CAGGAAACAG CTATGAC 17

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GTAAAACGAC GGCCAGT
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..19
(D) OTHER INFORMATION: /note= "All lowercase letters represent RiboG"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GTCACCCTCG ACCTGCAgC

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..20
(D) OTHER INFORMATION: /note= "All lowercase letters represent RiboG"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GTTGTAAAAC GAGGGCCAgT 20



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TCTGGCCTGG TGCAGGGCCT ATTGTAGTTG TGACGTACA 39



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TCAACACTGC ATGT 14



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 78 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

AAGATCTGAC CAGGGATTCG GTTAGCGTGA CTGCTGCTGC TGCTGCTGCT GCTGGATGAT 60
CCGACGCATC AGATCTGG

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CTACTAGGCT GCGTAGTC 18

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GATGATCCGA CGCATCACAG CTC 23

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CTACTAGGCT GCGTAGTGTC GAGAACCTTG GCT 33



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GATGATCCGA CGCATCACAG CTC 23



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

GTGATGCGTC GGATCATC 18



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TCGGTTCCAA GAGCT 15

CLAIMS



1. A method for determining the sequence of a nucleic acid, comprising the
steps of:

a) generating at least two base-specifically terminated nucleic acid
fragments containing modified purine nucleotides that are relatively
resistant to fragmentation during mass spectrometry;

b) determining the molecular weight value of each base-specifically
terminated fragment by mass spectrometry, wherein the molecular weight
values of at least two base-specifically terminated fragments are
determined concurrently; and

c) determining the sequence of the nucleic acid by aligning the base-specifically
terminated nucleic acid fragments according to molecular weight.

2. The method according to claim 1 wherein the nucleic acid fragments are
purified before the step of determining the molecular weight values by mass
spectrometry.

3. The method according to claim 2 wherein the nucleic acid fragments are
purified, comprising the steps of:

a) reversibly immobilizing the nucleic acid fragments on a solid
support; and

b) washing out all remaining reactants and by-products.

4. The method according to claim 3, further comprising the step of removing the
nucleic acid fragments from the solid support.

5. The method of claim 1, wherein the fragments contain deazapurine moieties.

6. The method of claim 1, wherein the deazapurine moieties are selected from
the group consisting of: C7-deazaadenine, C7-deazaguanine, 7-deazainosine
triphosphate, C9-deazaadenine, C9-deazaguanine and C9-deazainosine
triphosphate.

7. The method of claim 1, wherein at least about 50% of the purine nucleotides
are modified within the nucleotide fragment.

8. A process of claim 1 wherein the mass spectrometer is selected from the
group consisting of: Matrix-Assisted Laser Desorption/Ionization Time-of-Flight
(MALDI-TOF), Electrospray (ES), Ion Cyclotron Resonance (ICR), and Fourier
Transform and combinations thereof.

9. The method according to claim 1, wherein more than one species of nucleic 
acid are concurrently sequenced by multiplex mass spectrometric nucleic acid 
sequencing employing nucleic acid primers, chain-elongating nucleotides, and 
chain-terminating nucleotides, wherein one of the sets of base-specifically 
terminated fragments is unmodified and the other sets of base-specifically 
terminated nucleic acid fragments are mass modified, and each of the sets of 
base-specifically terminated nucleic acid fragments has a sufficient mass 
difference to be distinguished from the others by mass spectrometry.

10. The method according to claim 9, wherein at least one of the sets of mass- 
modified base-specifically terminated fragments is modified with a mass- 
modifying functionality at a heterocyclic base of at least one nucleotide. 

11. The method according to claim 10, wherein the heterocyclic base-modified
nucleotide is selected from the group consisting of a cytosine nucleotide
modified at C-5, a thymine nucleotide modified at C-5, a thymine nucleotide
modified at the C-5 methyl group, a uracil nucleotide modified at C-5, an adenine
nucleotide modified at C-8, an adenine nucleotide modified at C-7, a
c7-deazaadenine modified at C-8, a c7-deazaadenine modified at C-7, a guanine
nucleotide modified at C-8, a guanine nucleotide modified at C-7, a
c7-deazaguanine modified at C-8, a c7-deazaguanine modified at C-7, a
hypoxanthine modified at C-8, a c7-deazahypoxanthine modified at C-7, and a
c7-deazahypoxanthine modified at C-8.

12. The method according to claim 9, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality attached to one or more phosphate moieties of the
internucleotidic linkages of the fragments.

13. The method according to claim 9, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality attached to one or more sugar moieties of
nucleotides within the set of mass modified base-specifically terminated
fragments at at least one sugar position selected from the group consisting of a
C-2' position, an external C-3' position, and an external C-5' position.

14. The method according to claim 9, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality (M) attached to the sugar moiety of a 5'-terminal
nucleotide and wherein the mass-modifying function (M) is the linking
functionality (L).



15. The method according to claim 9, wherein a mass-modifying functionality
(M) is attached to a set of base-specifically terminated nucleic acid fragments
subsequent to generating the base-specifically terminated nucleic acid fragments
and prior to determining the molecular weight values for the nested fragments by
mass spectrometry.

16. The method according to claim 15, wherein the base-specifically terminated
nucleic acid fragments are generated using at least one reagent selected from the
group consisting of a nucleic acid primer, a chain-elongating nucleotide, a
chain-terminating nucleotide and a tag probe which has been modified with a precursor
of the mass-modifying functionality, M; and a subsequent step comprises
modifying the precursor of the mass-modifying functionality to generate the
mass-modifying functionality, M, prior to mass spectrometry analysis.

17. The method according to claim 9, wherein mass differentiation of the tag
probes is achieved by changing the nucleotide composition of at least one of the
tag probes and complementary tag sequence in the species of nucleic acid.

18. The method according to claim 9, wherein the tag probes are covalently
bound to the corresponding complementary tag sequence prior to mass
spectrometric analysis.



19. The method according to claim 18, wherein binding between the tag
probes and the corresponding complementary tag sequences is achieved
photochemically via photoactivatable groups.

20. A method of sequencing a nucleic acid, comprising the steps of:

a) reversibly linking an oligonucleotide primer to a solid support;

b) generating at least two base-specifically terminated nucleic acid
fragments containing nucleotides that are relatively resistant to
fragmentation during mass spectrometry;

c) determining the molecular weight value of each nested fragment
in each of the four sets of base-specifically terminated fragments of the nucleic
acid by matrix assisted laser desorption/ionization mass spectrometry,
wherein the molecular weight values of at least two base-specifically terminated
fragments are determined concurrently and wherein the nested fragments are
cleaved from the solid support by a laser during mass spectrometry; and

d) determining the nucleotide sequence by aligning the base-specifically
terminated fragments according to molecular weight.

21. The method according to claim 20 wherein the nucleic acid fragments are
purified before the step of determining the molecular weight values by mass
spectrometry.

22. The method according to claim 21 wherein the nucleic acid fragments are
purified, comprising the steps of:

a) reversibly immobilizing the nucleic acid fragments on a solid
support; and

b) washing out all remaining reactants and by-products.

23. The method according to claim 22, further comprising the step of
removing the nucleic acid fragments from the solid support.

24. The method of claim 20, wherein the fragments contain deazapurine
moieties.

25. The method of claim 20, wherein the deazapurine moieties are selected
from the group consisting of: C7-deazaadenine, C7-deazaguanine,
7-deazainosine triphosphate, C9-deazaadenine, C9-deazaguanine and
C9-deazainosine triphosphate.

26. The method of claim 20, wherein at least about 50% of the purine
nucleotides are modified within the nucleotide fragment.

27. A process of claim 20 wherein the mass spectrometer is selected from the
group consisting of: Matrix-Assisted Laser Desorption/Ionization Time-of-Flight
(MALDI-TOF), Electrospray (ES), Ion Cyclotron Resonance (ICR), and Fourier
Transform and combinations thereof.

28. The method according to claim 20, wherein more than one species of nucleic
acid are concurrently sequenced by multiplex mass spectrometry nucleic acid
sequencing employing nucleic acid primers, chain-elongating nucleotides, and
chain-terminating nucleotides, wherein one of the sets of base-specifically
terminated fragments is unmodified and the other sets of base-specifically
terminated nucleic acid fragments are mass modified, and each of the sets of
base-specifically terminated nucleic acid fragments has a sufficient mass
difference to be distinguished from the others by mass
spectrometry.

29. The method according to claim 28, wherein at least one of the sets of mass-
modified base-specifically terminated fragments is modified with a mass-
modifying functionality (M) at a heterocyclic base of at least one nucleotide.

30. The method according to claim 29, wherein the heterocyclic base-modified
nucleotide is selected from the group consisting of a cytosine nucleotide
modified at C-5, a thymine nucleotide modified at C-5, a thymine nucleotide
modified at the C-5 methyl group, a uracil nucleotide modified at C-5, an adenine
nucleotide modified at C-8, an adenine nucleotide modified at C-7, a
c7-deazaadenine modified at C-8, a c7-deazaadenine modified at C-7, a guanine
nucleotide modified at C-8, a guanine nucleotide modified at C-7, a
c7-deazaguanine modified at C-8, a c7-deazaguanine modified at C-7, a
hypoxanthine modified at C-8, a c7-deazahypoxanthine modified at C-7, and a
c7-deazahypoxanthine modified at C-8.



31. The method according to claim 28, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality (M) attached to one or more phosphate moieties of
the internucleotidic linkages of the fragments.

32. The method according to claim 28, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality (M) attached to one or more sugar moieties of
nucleotides within the set of mass modified base-specifically terminated
fragments at at least one sugar position selected from the group consisting of a
C-2' position, an external C-3' position, and an external C-5' position.

33. The method according to claim 28, wherein at least one of the sets of mass-
modified base-specifically terminated nucleic acid fragments is modified with a
mass-modifying functionality (M) attached to the sugar moiety of a 5'-terminal
nucleotide and wherein the mass-modifying function (M) is the linking
functionality (L).

34. The method according to claim 28, wherein a mass-modifying functionality
(M) is attached to a set of base-specifically terminated nucleic acid fragments
subsequent to generating the base-specifically terminated nucleic acid fragments
and prior to determining the molecular weight values for the nested fragments by
mass spectrometry.

35. The method according to claim 34, wherein the base-specifically
terminated nucleic acid fragments are generated using at least one reagent
selected from the group consisting of a nucleic acid primer, a chain-elongating
nucleotide, a chain-terminating nucleotide and a tag probe which has been
modified with a precursor of the mass-modifying functionality, M; and a
subsequent step comprises modifying the precursor of the mass-modifying
functionality, M, to generate the mass-modifying functionality, M, prior to mass
spectrometry analysis.

36. The method according to claim 28, wherein mass differentiation of the tag
probes is achieved by changing the nucleotide composition of at least one of the
tag probes and complementary tag sequence in the species of nucleic acid.

37. The method according to claim 28, wherein the tag probes are covalently
bound to the corresponding complementary tag sequence prior to mass
spectrometric analysis.

38. The method according to claim 37, wherein binding between the tag probes
and the corresponding complementary tag sequences is achieved
photochemically via photoactivatable groups.

39. A method of multiplex analysis of nucleic acid sequences, comprising the
steps of:

a) reversibly linking a nucleic acid primer to a solid support;

b) generating at least two conditioned, base-specifically terminated nucleic acid
fragments containing modified purine nucleotides that are relatively resistant to
fragmentation during mass spectrometry;

c) determining the molecular weight value of each fragment by matrix assisted
laser desorption/ionization mass spectrometry wherein the molecular weight
values of at least two base-specifically terminated fragments are determined
concurrently and wherein the fragments are cleaved from the solid support by a
laser during mass spectrometry; and

d) determining the nucleotide sequence by aligning the fragments according to
molecular weight; wherein at least one reagent selected from a group consisting
of a nucleic acid primer, a chain-elongating nucleotide, and a chain-terminating
nucleotide which has been mass-modified; wherein each set of base-specifically
terminated fragments has a sufficient mass difference from the other sets of
base-specifically terminated fragments so as to be unique; and wherein the molecular
weight values of the nested fragments of two or more sets of unseparated
base-specifically terminated fragments are determined concurrently.

40. The method according to claim 39, wherein the reversible linkage is
a photocleavable bond.

41. The method according to claim 39 wherein the base-specifically terminated
fragments are cleaved from the solid support prior to mass spectrometry.

FIGS. 12, 13, 16A, 16C, 16G, 16I, 16K